Gardner and Flek Data Analysis Human Development: Income and Education

Upload: ross-flek

Post on 07-Aug-2015


Instructor’s Note:

This presentation is meant to demonstrate an example of an Honors student’s work completed while enrolled in Professor Flek’s “Introduction to Statistics” course at Hostos Community College of the City University of New York, and under his advisement for the purposes of Honors Program membership.

The last slide contains a poster presentation created by the student for a Hostos academic event.

The course uses a project-oriented approach to the study of Statistics and Data Analysis. It is a first-semester, college-level Statistics course. Even though this is the work of an Honors student, and is therefore more advanced, all students in the course had to complete such a project.

The depth of the project is intended to convey the high level of student engagement and understanding that may be achieved using the project-oriented approach, even in a basic-level Statistics course.

***I would like to thank Reginald Gardner for allowing me to share his work.


Human Development in Terms of Education and Income


Abstract:

Education is a primary talking point in most elections, from the very large to the very local. It is the one topic that harbors more opinions than there were ever candidates, and one to which every person has an emotional affiliation. It is important that education around the world and within the United States be put into perspective, so that as a nation we may understand the situation more completely by observing our standing in a larger context. The research conducted here helps to communicate how education around the planet is faring. The research then attempts to compare education in the United States to that of the other countries of the world, ignoring both primary and secondary school administrative and grouping structures as well as socioeconomic status, race, and other measurements of inequality. The educational indices are put into the context of economics by looking for relationships between them and the GNI of countries. The research was conducted using elementary statistical methods; data from UNESCO was analyzed and presented with visual accuracy (ignoring participants with null data). The United Nations Educational, Scientific and Cultural Organization (UNESCO) was founded on 16 November 1945, and its research methods are renowned for their accuracy. Anyone with three months of experience studying statistics could use the same methods and arrive at similar, if not completely congruent, results. All countries beneath today’s averages need to rethink their strategies to maintain themselves in a future rich in thought. Lower-tiered countries on the rise must continue, as a better future is far more likely should they succeed in motivating their constituents and maintaining high retention in their academic systems. Furthermore, it can be assessed that no country can solve its qualms with education by simply providing more funding: the study shows that additional funding has no realistically predictable outcome.


Statistical Terminology and Human Development Indicator Definitions:

Mean - A quantity having a value intermediate between the values of other quantities; an average. Here expressed as μ.

Standard deviation - Shows how much variation there is between the values and the mean. A low standard deviation indicates that the data points tend to be very close to the mean; a high standard deviation indicates that the data points are spread out over a large range of values. Here expressed as σ.

Skewness - A measure of asymmetry: “a negative skew indicates most of the values are to the right of the mean. A positive skew indicates that most of the values are to the left of the mean. A zero value indicates that the values are evenly distributed.” Here expressed as γ.

Range - The difference between the maximum and minimum values of a data set. Here expressed as range.

Coefficient of Variation (CV) - The ratio of the standard deviation σ to the mean μ, reported here as a percentage. This report also refers to it as the coefficient of variance.

Coefficient of Correlation - Describes how well the values in a statistical model with two variables align with the regression line. Here expressed as r.

Critical r - Describes the range in which a correlation coefficient could fall if the correlation is completely coincidental. If the r value lies within this range (between -critical r and +critical r), there is not enough evidence to assess a true correlation.

Combined gross enrolment in education (both sexes) (%) - The number of students enrolled in primary, secondary and tertiary levels of education, regardless of age, as a percentage of the population of theoretical school age for the three levels.

Expected Years of Schooling (of children) (years) - Number of years of schooling that a child of school entrance age can expect to receive if prevailing patterns of age-specific enrolment rates persist throughout the child’s life.

Mean years of schooling (of adults) (years) - Average number of years of education received by people ages 25 and older, converted from education attainment levels using official durations of each level.


Public expenditure on education (% of GDP) (%) - Total public expenditure (current and capital) on education expressed as a percentage of GDP.

GNI per capita in PPP terms (constant 2005 international $) - Aggregate income of an economy generated by its production and its ownership of factors of production, less the incomes paid for the use of factors of production owned by the rest of the world, converted to international dollars using purchasing power parity (PPP) rates, divided by midyear population.
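The statistical terms defined above can be sketched in a few lines of Python. This is only an illustration: the `describe` helper and the enrollment values are hypothetical, and the report does not say which formulas (population or sample standard deviation, or which skewness estimator) were actually used, so the choices below are assumptions.

```python
import math

def describe(values):
    """Return the summary statistics defined above for a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    # Assumption: population standard deviation (divide by n).
    # A sample version would divide by n - 1 instead.
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    # Assumption: moment-based skewness, the average cubed z-score.
    skew = sum(((v - mean) / sd) ** 3 for v in values) / n
    return {
        "mean": mean,
        "sd": sd,
        "range": max(values) - min(values),
        "cv": sd / mean,  # coefficient of variation, as a fraction
        "skewness": skew,
    }

# Hypothetical enrollment percentages, invented for illustration only.
enrollment = [62.0, 71.5, 78.0, 80.5, 84.0, 90.0, 96.5]
stats = describe(enrollment)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

A negative `skewness` value corresponds to the "left skew" described in the definitions, and `cv` multiplied by 100 gives the percentage form used throughout the report.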

Introduction:

Human development is a complex topic, not easily explained or enumerated. However, over the last 30 years researchers at the UN have constructed a way to measure and compare “human development” in over 180 countries around the world. Human development is a concept that seeks to replace the standard evaluation of nations’ power through fiscal means. It measures more than the monetary wealth of nations; it measures efficiency, equality, equity, sustainability, security, freedom, social progress, and the enhancement of human capability.

What is most concerning is education across the world as measured by the UN. Education is a fundamental part of the future of any nation. By comparing and analyzing data on the world’s top and most mobile countries, one can best understand how education is working worldwide. With this information, one could speculate on how economic trends will shift, how international affairs will change, and how quickly countries may become world superpowers.

The data from three groups of countries will be analyzed to achieve these goals: the “Highest Ranked” countries (by human development index), the “Top Movers” (by percentage), and the “Top Movers” (by overall value). The variables analyzed will include each country's expected years of schooling, mean years of schooling, gross population enrolled in education, and public expenditure on education.

Furthermore, the world data on GNI will be applied as an addendum to the previously evaluated data, placing it into a different context. GNI will be compared across the world and then across the other human development indicators to search for any sort of relationship.

These are the only variables that would accurately depict any trends between countries of far different and similar economic and political standing. Patterns displayed within the top-moving groups are expected to be similar to the behaviors of the world’s highest ranked. The world’s highest ranked countries are expected to display similar trends among themselves.

During research, it was found that about seven countries consistently lacked voluntarily reported data: Tuvalu, Somalia, San Marino, Nauru, Monaco, the Marshall Islands, and the Democratic People’s Republic of Korea (North Korea).

Univariate Analysis (Descriptive Statistics):

Combined Gross Enrolment in Education

Figures 1.A – C. Combined Gross Enrolment in Education (both sexes) (%) (2009)

The sample for this indicator of human development consisted of 86 countries. The mean of the data is approximately 78.13% enrollment, with a standard deviation of 17.85%. No country had less than Eritrea’s 29.6% or more than Australia’s 112.1%, resulting in a range of 82.5%. The countries that best represent this mean are Algeria, Bulgaria, and the United Arab Emirates (UAE). With the lowest coefficient of variation among the indicators, the data here seem to be the most consistent. The histogram shows a close to normal distribution with a slight left skew of about -0.0375. The majority of countries had an enrollment rate greater than 50%. The box plot verifies this with the large observable distance between the minimum and the first quartile; in a normal distribution the quartiles would be evenly spaced. The modified box plot indicates three outliers, all of them unusually low percentages. These observations speak volumes about the countries: how a country's culture values education within and outside of the age ranges in which one is expected to receive it, and how well a country retains its population’s interest in education, are all important when discussing human development.
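A modified box plot usually flags outliers with the 1.5 × IQR fence rule. The report does not state which rule or quartile method its software used, so the sketch below assumes that common rule and Python's "inclusive" quartile method; the `tukey_outliers` helper and the sample values are hypothetical.

```python
import statistics

def tukey_outliers(values, k=1.5):
    """Return (low_outliers, high_outliers) using the k * IQR fence rule."""
    # Assumption: "inclusive" quartiles; other tools may use different methods.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - k * iqr
    high_fence = q3 + k * iqr
    low = [v for v in values if v < low_fence]
    high = [v for v in values if v > high_fence]
    return low, high

# Hypothetical enrollment percentages, invented for illustration only.
sample = [30.0, 55.0, 70.0, 72.5, 75.0, 78.0, 80.0, 82.5, 85.0, 88.0, 95.0]
low, high = tukey_outliers(sample)
print("low outliers:", low)
print("high outliers:", high)
```

Points outside the fences are drawn individually in a modified box plot, and the whiskers stop at the most extreme values that remain inside the fences.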


Expected Years of Schooling

The sample provides a great deal of information on how individuals are expected to be educated throughout the world. The mean of the data here is 12.2 years. For perspective, in the United States 12 years of schooling equates to 11th grade, just short of completing high school. Competition throughout the world, however, remains strong, as over 70% of countries expect their children to complete between 9 and 15 years of schooling. The data again is quite consistent, with a coefficient of variation sitting at a low 24.74%. The countries that best represent the mean are Botswana, Thailand, and Samoa. In stark contrast to the mean, Somalis are expected to partake in only about 2.4 years of formal education, while countries such as Australia, New Zealand, Iceland, and Ireland expect 18. This results in a relatively huge range of 15.6 years, well over a decade.
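The "between 9 and 15 years" band is consistent with the interval of one standard deviation on either side of the mean, which holds roughly 68% of a near-normal distribution. The report does not list the standard deviation for this indicator, so the sketch below recovers it from the reported mean and coefficient of variation; the helper name `one_sd_interval` is hypothetical.

```python
def one_sd_interval(mean, cv):
    """Return (sd, low, high) for the interval of one sd around the mean."""
    sd = cv * mean  # CV is sd / mean, so sd = CV * mean
    return sd, mean - sd, mean + sd

# Mean and coefficient of variation reported in the text
# for expected years of schooling.
sd, low, high = one_sd_interval(mean=12.2, cv=0.2474)
print(f"sd is about {sd:.2f} years")
print(f"one-sd interval is about {low:.1f} to {high:.1f} years")
```

The same arithmetic can be applied to the other indicators wherever a mean and a coefficient of variation are reported.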

Figures 2.A – C. Expected Years of Schooling (of children) (years) (2011)

The histogram shows a slightly stronger left skew of about -0.0526. This indicates that countries have relatively high expectations of their peoples and of their own ability to retain students within their respective educational systems. The box plots provide consistent visual support for the claim, as the distance between the Min. and Q1 is almost equal to the distance between Q1 and the Max. The modified box plot identifies two outliers, the minimum and the next lowest data point, showing that generally no country expects less than five years from its peoples.


Mean Years of Schooling

Figures 3.A – C. Mean years of schooling (of adults) (years) (2011)

The sample size of this more realistic measure of the world’s relative education is 187 (99.47% of the population), one country short of the complete data set; almost every statistically relevant country is accounted for. The mean of this data set is 7.6 years, only 62.29% of the mean of expected years of schooling. This is the first evidence of a worldwide education problem, as about 70% of the world spends only about 4.8 to 10.6 years in academia. The countries that best represent the mean in this case are Venezuela, Ecuador, and Paraguay. The most disadvantaged citizens are those of Mozambique, who attend school for little over one year (1.2 years), while Norwegians study for more than 12 (12.6 years). The range of years of schooling worldwide is 11.4, more than four thousand days. Unlike enrollment, which counts elders and foreigners, and expected years of schooling, which boasts a recommended amount of schooling, the data represented here show staggering differences in the actual amount of time citizens remain in school. Considering both the niche for intellectuals to fill and the developmental processes of human beings, this alone provides for a very easily tiered society. A population at the minimum of 1.2 years could not compete in an educational environment with one at the Q1 value of 5.3 years. However, this group holds the highest coefficient of variation, 38.88%, meaning its data is spread far apart; extremes and outliers are to be expected.


The histogram is again skewed left, at -0.1318, with an abundance of data around the mean. One does notice, however, that there is much more data at or below two years of schooling than above 10 years. The box plots reflect the minor low-end curve with relatively even spacing between Min. to Q1 and Q1 to Q3. The modified box plot shows no outliers. With only 42% of the data standing below the mean, the more even distribution of this data is apparent.

Public Expenditure on Education

Figures 4.A – C. Public expenditure on education (% of GDP) (%) (2007)

The sample size for this indicator is 100; that is, 53.2% of the population has shared its data on the % of GDP spent on education. The mean is 4.6%, a fair amount of money for any nation to spend on education: 4.6% of the US’s GDP is just short of $700b. About 70% of the world follows this trend and dedicates from 2.8% to 6.4% of its GDP to education. The countries that best represent this mean are Germany, Malaysia, and Burkina Faso. The data, however, has a large coefficient of variation, similar to that of mean years of schooling, so the data will be similarly spread.

The UAE spent the least on education, just under 1% of GDP, while Cuba spends almost 12%. With a range of over 10% of a country’s GDP, one can observe problems among countries similar to those seen in mean years of schooling. The indicator demonstrates how much countries care for the education of their citizens from an economic and political viewpoint.

The histogram shows the first observed distribution with a right skew, of 0.0568. A vast majority of the data points are close to the mean, with what seems to be one obvious outlier. The box plot reflects the range, making visible the distance between the Min. and Max.; however, it is perplexing that the length of the plot between Q3 and the maximum occupies such a large percentage of the space. The modified variant restores the symmetry observed in the histogram by isolating Cuba, the only outlier.

Bivariate Analysis (Correlation and Regression):

More interesting than the worldwide trends are their relationships with each other. Correlation analysis consists of using two variables to determine whether they are related and how closely. With that in mind, the sample sizes for correlation analysis will be marginally smaller than those of the descriptive statistics of each indicator, as a country is valid only if it has data for both indicators. Six situations will be examined, as the four indicators allow for only that many distinct pairings.

Since all the indicators have to do with the same central focus, education, one could easily assume that they all share positive linear relationships. Even so, there may be a surprising outcome, as the indicators deal with different aspects of statistics and society. For example, the enrollment measurement can exceed 100% because of its parameters, and it has more to do with which citizens and foreigners are in school than with whether or not they are expected to be there. The expenditure on education should bear little effect on any of the other indicators, as private institutions are a confounding variable that the other three share; in the measurement of public expenditure, they are not accounted for. Expected years of schooling should have a great effect on the mean years of education as well as on the population enrolled, as it serves as a pressure that maintains retention in academia. These correlations, however, do not serve the purpose of clarifying causality; they exist only to help one understand which measurements are closely related, directly or through other variables, or completely independent of each other. All the initial tests are carried out with outliers included and share a common significance level of 0.05.


Combined gross enrolment in education (both sexes) (%) vs. Expected Years of Schooling (of children) (years)

Figure 5. CGEE vs. EYS

The first analysis is that of combined gross enrollment in education (both sexes) and expected years of schooling. The sample consisted of 86 countries, which is rather large considering the parameters required. The two indicators are very much related, with a correlation coefficient of 0.98. The y intercept of the regression line in the scatter plot (Figure 5) is below the x axis, indicating that with a projected zero years of schooling students would still be enrolled. One might see that as a very favorable worldwide trend. The shape of the correlation is linear with a positive slope. There is more than enough evidence to support the correlation, as the coefficient exceeds the critical r of 0.21. In a prediction example, a country with 100% enrollment should have about 14.2 expected years of schooling.
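The critical r values quoted in this section can be approximated from the sample size alone. The sketch below assumes the usual two-tailed t test for a Pearson correlation at the 0.05 significance level, where critical r = t / sqrt(t² + df) with df = n − 2; the report does not say how its critical values were obtained (a printed table is likely), and the function names here are hypothetical. Only the standard library is used, so the t quantile is found numerically.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log(1 + t * t / df))

def t_cdf(x, df, steps=2000):
    """CDF for x >= 0, via Simpson's rule on [0, x] (steps must be even)."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def critical_r(n, alpha=0.05):
    """Two-tailed critical value of Pearson's r for a sample of n pairs."""
    df = n - 2
    lo, hi = 0.0, 50.0
    for _ in range(60):  # bisection for the t quantile at 1 - alpha / 2
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < 1 - alpha / 2:
            lo = mid
        else:
            hi = mid
    t_crit = (lo + hi) / 2
    return t_crit / math.sqrt(t_crit ** 2 + df)

# Sample sizes used in the worldwide correlation analyses.
for n in (86, 59, 100):
    print(n, round(critical_r(n), 2))
```

A correlation is treated as supported only when the absolute value of r exceeds the value returned for that sample size; the threshold grows as the sample shrinks, which matters for the smaller group samples analyzed later.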


Combined gross enrolment in education (both sexes) (%) vs. Mean years of schooling (of adults) (years)

Figure 6. CGEE vs. MYS

The next indicators to be analyzed are gross enrollment and mean years of schooling. The result is close to the last correlation, with a sample size of 85, a correlation coefficient of 0.84, a linear shape, and a positive direction. Again, the y intercept lies below the x axis, which is unrealistic but statistically relevant. There is enough data to support the correlation, as the coefficient exceeds the critical r of 0.21. In the situation where the mean years of schooling is zero, there would still be students enrolled. In another unrealistic predictive scenario, when there is 100% student enrollment in a country's education system, the mean years of schooling should be close to 16.9.
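For readers who want to reproduce this kind of analysis, the correlation coefficient and the regression line can be computed directly from paired data. The sketch below is a minimal version assuming ordinary least squares with enrollment on the x axis; the `pearson_and_line` helper and the data pairs are hypothetical and are not the report's data.

```python
import math

def pearson_and_line(xs, ys):
    """Return (r, slope, intercept) for the line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sxy / math.sqrt(sxx * syy)      # Pearson correlation coefficient
    slope = sxy / sxx                   # least-squares slope
    intercept = mean_y - slope * mean_x # line passes through the two means
    return r, slope, intercept

# Hypothetical (enrollment %, mean years of schooling) pairs,
# invented for illustration only.
enrollment = [45.0, 60.0, 70.0, 80.0, 90.0, 100.0]
mean_years = [3.0, 5.5, 6.5, 9.0, 10.0, 12.5]

r, slope, intercept = pearson_and_line(enrollment, mean_years)
print(f"r = {r:.3f}, slope = {slope:.3f}, intercept = {intercept:.3f}")
print(f"predicted mean years at 100% enrollment: {intercept + slope * 100:.1f}")
```

A negative intercept, as in the analyses here, simply means the fitted line crosses zero years at a positive enrollment value; it is an extrapolation beyond the observed data rather than an observed country.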


Combined gross enrolment in education (both sexes) (%) vs. Public expenditure on education (% of GDP) (%)

Figure 7. CGEE vs. PEE

The last analysis of gross enrollment pairs it with public expenditure on education (in terms of % of GDP). The sample size in this analysis is a fairly small 59 countries. With a linear and still positive slope, the correlation is weak, with a coefficient of 0.27. This number sits just above the critical r of 0.26, meaning there is barely enough data to confirm the correlation. The y intercept tells us that 0% of students are enrolled when the government of a country spends about 2.5% of its GDP on education; unrealistic, but very telling for a pair of indicators so poorly related.

Predictions from this data tell us that with 100% enrollment countries should be spending 5.52% of their GDP. A reasonable request, one might say; however, the prediction for a country using 10% of GDP on education is about 249% of the expected population attending school. Such an unlikely number gives very little weight to the correlation.
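The two predictions above can be checked against each other with simple arithmetic. The sketch below rebuilds the line from the two rounded points quoted in the text (the intercept of about 2.5% and the 5.52% prediction at 100% enrollment) and then solves it backwards; the helper names are hypothetical, and because the inputs are rounded the result only approximates the report's figure.

```python
def line_through(p1, p2):
    """Return (slope, intercept) of the straight line through two (x, y) points."""
    (x1, y1), (x2, y2) = p1, p2
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1

# x = combined gross enrollment (%), y = public expenditure (% of GDP).
# Both points are the rounded values quoted in the text.
slope, intercept = line_through((0.0, 2.5), (100.0, 5.52))

def enrollment_for(expenditure):
    """Solve expenditure = intercept + slope * enrollment for enrollment."""
    return (expenditure - intercept) / slope

print(f"slope is about {slope:.4f} GDP points per enrollment point")
print(f"enrollment implied by 10% of GDP: about {enrollment_for(10.0):.0f}%")
```

The result lands within a point or two of the roughly 249% quoted above, with the small gap presumably due to the rounded intercept. The slope is so shallow that any expenditure far from the bulk of the data maps to an impossible enrollment figure, which is the weakness the text points out.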


Expected Years of Schooling (of children) (years) vs. Mean years of schooling (of adults) (years)


Figure 8. EYS vs. MYS

The second set of correlations begins with expected years and mean years of education. With an astounding sample size of 187, the analysis takes into account almost every country in the present population. This sample represents the most statistically relevant amount of information available in the groups of correlation analysis. The pair has a very strong correlation with a coefficient of 0.8, a linear shape, and a positive slope. The y intercept lies below the x axis, indicating that a country with mean years of schooling of zero still expects about 2.7 years. This tells us that, on average, a country’s expectations of its citizens far exceed the realities. The predicted mean years of schooling when 16 are expected (K-15, or the U.S. equivalent of an associate degree) is only 10.6 (the U.S. equivalent of some high school). When the mean is 13 years (U.S. high school completion), the predicted expectations fall just short of 19 years (completion of a master's degree). Both predictions are reasonable and complement the data.


Expected Years of Schooling (of children) (years) vs. Public expenditure on education (% of GDP) (%)

Figure 9. EYS vs. PEE

Expected years of schooling and public expenditure on education, much like gross enrollment in education and public expenditure, have a slightly positive linear correlation, with a coefficient of about 0.33. The sample size, at 100 countries, is just short of double that of the enrollment pairing. The correlation exceeds the bounds of critical r, meaning the correlation is valid, and the y intercept lies at about 2. In other words, when zero years of schooling are expected, countries still spend about 2% of their GDP on education. One might call this a waste, but more observations of the data are needed to be conclusive. Two reasonable predictions can easily test the data’s viability. If the expected years of schooling in a country are 13 (the U.S. equivalent of K-12), then the country is likely to spend around 4.7% of GDP, a reasonable assessment. However, the inverse, a country spending about 10% of GDP on education, is unrealistic; a country in that situation is predicted to have an expectation of about 40 years of schooling. More than half of the average person’s life is unreasonable. If data cannot give us decent predictions, it is interesting but not reliable for much.


Mean years of schooling (of adults) (years) vs. Public expenditure on education (% of GDP) (%)

Figure 10. MYS vs. PEE

The last pairing is that of mean years of schooling and public expenditure on education. The sample size was a steady 100 countries. The correlation coefficient was extremely paltry in this case at 0.19, the weakest correlation of any pairing of indicators. It is so weak that it falls within the range of the critical r of 0.2, meaning that there is not enough data to conclude any correlation; the following information and analysis of it could be based on completely random happenstance. The regression line in this case has no relevant slope and is close to flat, but the correlation is still linear in shape. Predictions ultimately will determine whether further testing has a moderate chance of showing a conclusive regression. If the public expenditure on education of a country is 10% of GDP, it is predicted that students will remain in school for 50+ years. No other predictions need to be made, as that is completely unrealistic. The scatter plot (Figure 10) shows the unreliable data.

Specified Group Comparison Analysis:

[Group includes: Top HDI rank, Top HDI movers (%), Top HDI movers (value), Top Education index (EI) rank, Top EI movers (%), and Top EI movers (value)]

To understand the differences between the trends and correlations of the entire world, we must observe a specific part of the whole. The sample depicted represents the top-ranked countries by both HDI and EI standards, as well as the top movers, those who have had significant positive change given data from 1980 to 2010. These countries represent a large spectrum of the world, but all have one thing in common: they are what others in similar scenarios should strive to be. The countries in this sample include Norway, Australia, Netherlands, United States, New Zealand, Canada, Ireland, Liechtenstein, Germany, Sweden, Afghanistan, Bangladesh, Benin, China, Islamic Republic of Iran, Mali, Morocco, Myanmar, Nepal, Niger, Algeria, Egypt, Republic of Korea (South Korea), Tunisia, Turkey, United Arab Emirates, Bahrain, Botswana, Burundi, Libyan Arab Jamahiriya, Slovenia, Spain, Uganda, Yemen, and Rwanda. Their HDI rankings range from 1st to 185th, and they are by far the most diverse group of leaders in human development the world has known to date.

The combined gross enrollment indicator had a sample size of 22 countries in this group, with a mean of 81.75% and a standard deviation of 20.68%. The countries ranged from Niger’s minimum of 31.3% to Australia’s maximum of 112.1%, an 80.8% difference. The country that best represented the mean was the United Arab Emirates. With a coefficient of variation of 25.3% and a skewness of only -0.031, it can be said that most of the data was normal.

Expected years of schooling had a larger sample size of 35, with a mean of 13.04 years and a standard deviation of 3.54 years. The countries ranged from Niger’s minimum of 4.9 years to the 18 years shared by New Zealand, Ireland, and Australia, a 13.1 year difference overall. With a coefficient of variation of only 27.16% and a moderate skew of -0.073, it can be said that this data, too, was almost normal.

Mean years of schooling was represented by the same 35 countries, with a mean of 7.67 years and a standard deviation of 3.74 years. Again, the countries ranged from Niger’s 1.4 years to Norway’s 12.6, leaving an 11.2 year gap. With a huge coefficient of variation, 48.76%, it can be observed that the data is very much spread across the range. With a higher skew of 0.099, it can be said that this data is less normal and perhaps not a good representation of the best in the world’s practices.

The expenditures on education similarly varied in results, but with a far smaller sample size of 22. The mean was 4.67% with a standard deviation of about 1.7%. A larger coefficient of variation, 36.37%, indicates widely spread data. The least any one of these countries spends is 0.9%, by the United Arab Emirates, and the most, in % of GDP, is spent by Botswana, about 8%. With a skewness of about 0.012, the graph shows only a minor skew.

Further analysis of the indicators, comparing both world and group data, provides insight into how much “better off” the countries in the group are. For all indicators, the means were higher by small amounts and the ranges were lower, meaning less of a gap between the very well-off and the not so wealthy. In some cases, as with mean years of schooling, the coefficient of variation was higher.

Specified Group Correlation and Regression Analysis:

[Group includes: Top HDI rank, Top HDI movers (%), Top HDI movers (value), Top Education index (EI) rank, Top EI movers (%), and Top EI movers (value)]

The group indicated in this analysis is expected to share stronger correlations among its indicators, as its members have the common trait of either remaining on top or moving fairly well up the chain in the last 30 years. Public expenditure is expected to share very little correlation with any other indicator, as it showed only weak signs of correlation in the worldwide regression analysis.

Mean years of education and public expenditure on education, with a sample size of 22 countries within this group, shared a very weak correlation with a coefficient of 0.036, a number within the range of critical r, making the sample not viable. However, the shape of regression for this pairing was linear and nearly flat but positive in slope. Expected years of schooling and public expenditure shared the same fate; with an equal sample size, the correlation coefficient lay at 0.309, well below the critical r. Even with a more moderate correlation and a linear positive slope, the data does not provide evidence of a true correlation. Again, when expenditure was paired with gross enrollment in education, not enough evidence was found. The pair shared a sample of 16 countries, and the correlation coefficient was 0.187, with a weak positive linear regression line. It seems that expenditure on education either has no correlation to the other indicators or the relationship is always extremely frail.

When expected years of schooling were analyzed with mean years of schooling, the results began to

change. With a much larger sample size of 35 countries, the two shared a correlation

coefficient of 0.924, a very strong positive linear relationship. This pattern continued when gross

enrollment in education was measured against expected years of schooling. With a smaller sample size of

22 countries, the correlation coefficient remained strong at 0.983, an almost perfect correlation. The

regression line remained strong, positive and linear. Lastly, gross enrollment and mean years of schooling

showed similar results, with a sample size of 22 and a correlation coefficient of 0.933. With the new data,

it became even more unlikely that public expenditure on education has any significant correlation with the

other three indicators.
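
For readers who want to see how a correlation coefficient and regression line of this kind are obtained, here is a minimal sketch. The data are invented for illustration only (they are not taken from the Human Development Report), and the function names are hypothetical; the formulas are the standard Pearson correlation and least-squares line.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def least_squares(xs, ys):
    """Slope and intercept of the least-squares regression line y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical values: mean years of schooling (x) and
# expected years of schooling (y) for six made-up countries.
mean_years = [3.1, 5.4, 7.0, 9.2, 10.8, 12.3]
expected_years = [8.0, 10.1, 11.9, 13.6, 15.2, 16.4]

r = pearson_r(mean_years, expected_years)
slope, intercept = least_squares(mean_years, expected_years)
print(round(r, 3), round(slope, 3), round(intercept, 3))
```

A positive slope together with an r close to 1 corresponds to what this section calls a strong, positive, linear relationship.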


Reginald Thomas Gardner Jr. is a senior studying Political Science at Binghamton University. After interning with Mayor Bill de Blasio and Director Elizabeth Glazer of New York City's Mayor's Office of Criminal Justice, he returns to graduate in 2015. He received his associate’s degree from Hostos Community College, following a diploma from The Bronx High School of Science, both located in his home borough. After high school, he pursued a career in entertainment through musical performance, and after 2 years of playing concerts throughout the city and producing an album, he decided to return to higher education. During his 2 years in community college he worked on several major academic and service initiatives through intensive courses, the Hostos Honors Program, and the Student Leadership Academy. This year at Binghamton University, he has devoted his time to students and the community as a club president, a mentor, and a Student Conduct Board member, and has continued to push Binghamton towards becoming the premier public university by 2020 through the Road Map, all while pursuing a minor in Education and a certificate in Global and International Affairs. Last month he was inducted as a Brother of Alpha Phi Omega, the national public service fraternity, and was accepted to MPA programs at both Binghamton and Brown University.

Dr. Ross Flek is currently conducting pure mathematics research as well as mathematics education research, and education research in general. As a mathematics professor, he is afforded the opportunity to further advance his own research and to support current students who demonstrate great interest and potential for a promising career in the field of mathematics, as well as other tangential STEM disciplines.



Addendum

Exploration of GNI Descriptive Statistics

GDP is a measure of a country's economic output: the value of all goods and services produced

physically within the nation's borders. It measures the strength of the local economy relative to the rest of the world.

GNI, although it includes GDP, also includes income earned by a country's citizens in other nations. It measures

the earning strength of the people. As education is a public service that receives substantial funding, I

deemed GNI the most apt way to measure a country’s economic strength and pit it against the other indicators. It

is preferable to use a measure of the strength of a country’s citizens rather than that of a “standard of

living.”

Figures 11.A – C


The GNI analysis sample contained 187 countries and had a mean of $12782.21 per capita. Almost 70%

of countries have a GNI between $265 and $28242.92. The mean is best represented by the countries

Mauritius, Botswana and Lebanon. With a standard deviation of about $15460.71, larger than the mean, one

can infer that the majority of the data lies on the left (lower) end of any graph. The distribution is heavily right-skewed, with

around 74.5% of the world living with a GNI under the United States poverty threshold for a nuclear family

(DHHS, 2011). The box plots support this claim, with approximately 75% of the space occupied by the

distance between Q3 and the maximum. The difference between the minimum, Liberia, and the maximum,

Qatar, was $107456. The coefficient of variation is around 120%, which is much higher than that of any other

indicator. A simple explanation for this is the scale on which the values are measured. The

range is large considering that there are fewer than 200 data points. With cents (1/100) as a

factor, there are over 10 million possible values for each data point. Regardless of whichever trends appear,

there would be a substantial degree of variance. The sizeable coefficient of variation, like the range, supports the

assertion that the data points are very distant from one another in these graphs, as confirmed by every country

over $40000 per capita appearing as an outlier (a value unusually distant from the rest of the

data). The modified box plot shows the 10 points over $40000 as outliers. The world’s socioeconomic

differences are apparent here, as one could easily locate a cutoff for wealthy countries. The question remains:

is the education found solely there as well?
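
The summary figures quoted above can be checked with a few lines of arithmetic. The sketch below uses the mean, standard deviation and minimum reported in this section; the function names are hypothetical, and the quartile values passed to `upper_fence` are made up purely to show the usual 1.5 x IQR rule behind a modified box plot, since the paper does not list its quartiles.

```python
def coefficient_of_variation(mean, sd):
    """Standard deviation expressed as a percentage of the mean."""
    return sd / mean * 100


def one_sd_interval(mean, sd, minimum):
    """Interval of one standard deviation around the mean.

    The lower bound is clipped at the observed minimum, because a
    per capita income cannot fall below the smallest value in the data.
    """
    return max(mean - sd, minimum), mean + sd


def upper_fence(q1, q3):
    """Upper outlier cutoff of a modified box plot: Q3 + 1.5 * IQR."""
    return q3 + 1.5 * (q3 - q1)


# Values reported for the worldwide GNI sample.
mean, sd, minimum = 12782.21, 15460.71, 265.0

print(round(coefficient_of_variation(mean, sd), 2))
print(one_sd_interval(mean, sd, minimum))

# Hypothetical quartiles, chosen only to illustrate the fence rule.
print(upper_fence(q1=1000.0, q3=17000.0))
```

Because the standard deviation exceeds the mean, the unclipped lower bound would be negative, which is why the interval in the text starts at the minimum of $265 rather than at the mean minus one standard deviation.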


Finding a correlation between earning power and education – GNI Education Indicators:

GNI v. Combined gross enrolment in education (both sexes) (%)

The results of the correlation analysis of GNI and gross

enrollment in education (top left of section) were

unexpected. With the smallest sample, only 85 countries,

the relationship between the two is moderate at best: a

positive linear relationship with a correlation coefficient of

0.422, which, considering the higher coefficients gross

enrollment has had previously in the study, is quite low.

Because the correlation coefficient exceeds the critical r

of 0.213, the data provide enough evidence for the

correlation to be considered significant.

Figure 12.

GNI v. Expected Years of Schooling (of children) (years)

GNI is slightly more closely related to expected

years of schooling (top right of section), with a correlation

coefficient of 0.55, well above the critical r of 0.144.

The sample size in this case was much larger, at 187

countries. The regression line is both linear and

positive, but when observing the points on the graph it would

seem more accurate to call the shape hyperbolic with a

plateau, as the lower points rise steeply, seemingly

exponentially, until about x = 15000 and then level off.

Figure 13.
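
One common way to handle a curve that rises quickly and then plateaus is to take the logarithm of the x variable before computing the correlation. This is not the method used in the paper; it is a sketch of one possible follow-up, run on synthetic data that were generated to follow a logarithmic curve exactly. The function name and the numbers are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Synthetic GNI per capita values (not real countries).
gni = [500, 1000, 2000, 5000, 10000, 15000, 30000, 60000]

# Synthetic "expected years of schooling" built from a log curve,
# so that the values climb quickly at first and then flatten out.
years = [2 * math.log(g) - 8 for g in gni]

r_raw = pearson_r(gni, years)
r_log = pearson_r([math.log(g) for g in gni], years)

print(round(r_raw, 3), round(r_log, 3))
```

On data shaped like this, the correlation with log(GNI) is higher than the correlation with raw GNI, which suggests that a straight line through the untransformed points understates the strength of a plateauing relationship.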


GNI v. Mean years of schooling (of adults) (years)

The relationship between GNI and mean years of schooling

(bottom left of section) is very similar to that of the last two

indicators. As with the previous two, the shape is nominally

linear but somewhat hyperbolic. The regression line

displayed has a correlation coefficient of 0.527, well above

the critical r of 0.144, and the sample size of this data set is

quite large at 189 countries. Thus far GNI has been

moderately related to every indicator.

Figure 14.

GNI v. Public expenditure on education (% of GDP) (%)

Public expenditure on education (bottom right of

section) seems to have no relationship with

GNI. With an ample sample size of 100 countries, the

correlation coefficient came to a

negligible 0.040, well below the critical r of

0.2, meaning the correlation could be due entirely to chance

and that there is not enough evidence to consider even that

minute correlation significant.

Figure 15.


GNI and Education Indicators Correlation Analysis within a Selected Group:

[Group includes: Top HDI rank, Top HDI movers (%), Top HDI movers (value), Top

Education index (EI) rank, Top EI movers (%), and Top EI (value)]

GNI within the selected group was far different from the GNI of the world. In this case, the mean

was $16891.44 per capita, with a standard deviation of $15216.02. From this information alone one learns

that the histogram representing this data would be skewed to the right. The coefficient of variation of 96.00% shows

how spread out the data is, as everything over about $35000 per capita would be considered an outlier.

The country whose citizens make the least is Burundi,

at $356 per capita. The country whose citizens make the most is the United Arab Emirates, at $52435 per

capita. The histogram has a right skew of 0.372.

GNI and gross enrollment within this group have a sample size of only 22 countries and a correlation

coefficient of 0.585. The regression is linear and positive, and the strength of the correlation is

moderate. GNI and expected years of schooling within this group share a sample size of 35 countries,

with a stronger correlation coefficient of 0.67. The regression is linear and positive, and moderate to strong.

GNI and mean years of schooling have a sample size of 35 as well. The correlation coefficient is the highest at 0.778, and the

regression is linear, positive and strong. GNI and public expenditure on education have a sample size of 22 and

a surprising correlation coefficient of -0.26; however, it is within the bounds of the critical r, and as such

there is not enough evidence to support the only negative linear correlation of the group.
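
An equivalent way to reach the same verdict on the negative coefficient is to convert r into a t statistic and compare it with a critical t value. The sketch below assumes a two-tailed test at the 0.05 level, which the paper does not state explicitly; the function name is hypothetical, and 2.086 is the standard table value for 20 degrees of freedom at that level.

```python
import math


def t_statistic(r, n):
    """t statistic for testing whether a correlation r from n pairs is zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# Two-tailed critical t for alpha = 0.05 and df = 22 - 2 = 20 (table value).
T_CRITICAL_DF20 = 2.086

# GNI v. public expenditure on education within the selected group.
t = t_statistic(-0.26, 22)
significant = abs(t) > T_CRITICAL_DF20

print(round(t, 3), significant)
```

Since the absolute value of t falls short of the critical value, the test agrees with the critical r comparison: the sample of 22 countries does not support a negative relationship between GNI and public expenditure on education.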


Conclusion:

It can be seen through this testing that the countries to follow are those in the “top” group, as they represent

a wide variety of countries that have only one thing in common: progress. The purpose of this study

is not simply to find correlations and determine how education is faring worldwide through an objective

lens, but to peer further through that lens and suggest remedies for existing problems in light of

the facts discussed. All countries below today’s averages need to rethink their strategies

to sustain themselves in a future rich in thought. Lower-tiered countries on the rise must continue, as a

better future is far more likely should they succeed in motivating their constituents and maintaining high

retention in their academic systems. Furthermore, it can be concluded that no country can solve its problems

with education simply by providing more funding. The study shows that doing so has no reliably predictable

outcome. In conclusion, more research on correlation must be done, such as evaluating

individual states or provinces within countries to increase sample sizes, and for these studies to have any true

meaning, an experiment addressing association, prediction, exclusion of alternatives, and dose

dependence must be conducted to prove causality.

References

Dodge, Y. (Ed.). (2006). The Oxford dictionary of statistical terms (6th ed.). Oxford: Oxford University Press.

Federal Register, Vol. 77, No. 17, January 26, 2012, pp. 4034-4035. Retrieved from

http://aspe.hhs.gov/poverty/12poverty.shtml#thresholds

Klugman, J., et al. (2011). Human Development Report 2011. Retrieved from United Nations

Development Programme at http://hdr.undp.org/en/reports/global/hdr2011/download

United Nations Development Programme: Human Development Reports. Data. (2011) Retrieved from

http://hdr.undp.org/en/data
