Gini index and income inequality


Upload: duanrui-shi

Post on 16-Jan-2017


Executive summary

The aim of this paper is to understand which factors might be related to income inequality. The Gini index was set as the response variable and several explanatory variables were examined. In the end, only three of the explanatory variables were used in the multiple regression to model the Gini index across countries.

Introduction

The Gini index is a measurement of the income distribution of a country's residents. This number, which ranges between 0 and 100 and is based on residents' net income, helps define the gap between the rich and the poor, with 0 representing perfect equality and 100 representing perfect inequality. Previous research has shown that the Gini index is significantly associated with macroeconomic factors such as growth rate, income level, and investment rate. This paper looks beyond macroeconomic factors and considers indicators such as education, urban population, and unemployment rate. Based on cross-country data, the goal is to find which explanatory factors are strongly related to income inequality and how those factors influence it.

Data

The 2010 Gini index for each country was downloaded from the World Bank; where a 2010 value was missing, data from 2011 or 2009 was treated as the 2010 value in order to have more observations. At the beginning, I set “lending interest rate” and “expenditure on education as a percentage of GDP” as explanatory variables. Although those data were also from 2010, so few countries had values for all of the variables that I had to give up these two explanatory variables because they left too few observations. Combining the following five explanatory variables with the Gini index gives 84 observations. The variables “Population” and “Unemployment” were removed after fitting the model because their p-values were large, which indicates that they are not statistically significant.
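The selection of observations described above can be sketched in Python. This is only an illustration under stated assumptions, not the procedure actually used: the nested-dictionary layout, the helper names `pick_value` and `complete_cases`, the toy numbers, and the order in which 2011 and 2009 are tried are all hypothetical.

```python
# Toy data, NOT World Bank figures: {indicator: {country: {year: value}}}
raw = {
    "gini": {"A": {2010: 30.0}, "B": {2011: 45.0}, "C": {2009: 50.0}},
    "urban_population": {"A": {2010: 80.0}, "B": {2010: 60.0}},
}

# Assumption: only the Gini index falls back to neighbouring years,
# and 2011 is tried before 2009.
YEARS_TO_TRY = {"gini": (2010, 2011, 2009)}


def pick_value(series, years):
    """Return the value for the first year in `years` present in `series`, else None."""
    for year in years:
        if year in series:
            return series[year]
    return None


def complete_cases(data):
    """Keep only countries that have a usable value for every indicator."""
    rows = {}
    all_countries = set().union(*(by_country.keys() for by_country in data.values()))
    for country in sorted(all_countries):
        row = {}
        for indicator, by_country in data.items():
            years = YEARS_TO_TRY.get(indicator, (2010,))
            row[indicator] = pick_value(by_country.get(country, {}), years)
        if all(value is not None for value in row.values()):
            rows[country] = row
    return rows


rows = complete_cases(raw)
print(sorted(rows))  # country C is dropped: it has no urban population value
```

Each indicator added to `raw` can only shrink the set of complete cases, which is why adding variables with sparse coverage reduces the number of observations.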

Data description

Page 2: Gini index and income inequality

1

Explanatory variable   Description
GDP per capita         Gross domestic product divided by midyear population
Population             Total population, which counts all residents regardless of legal status or citizenship
Unemployment           The share of the labor force that is without work but available for and seeking employment
Urban population       The percentage of people who live in urban areas
Enrollment ratio       Percentage of total enrollment in tertiary education (ISCED 5 to 8), regardless of age

Model 1 summary

Methods

Model 2 summary

Page 3: Gini index and income inequality

2

The explanatory variables explain 34.85% of the variation in the response variable, which is not high but acceptable. All the explanatory variables are significant.

To compare models 1 and 2, we can do a partial F-test:

H0: β_population = β_unemployment = 0

Ha: at least one of these two slopes is not 0

The F statistic is 0.1524 with 2 and 78 degrees of freedom (the residual degrees of freedom are 80 for the smaller model and 78 for the larger model). The p-value of 0.8589 is far larger than α = 0.05. Therefore we fail to reject the null hypothesis and conclude that unemployment and population do not contribute significantly to the model. The individual t-tests for the slopes in the summary of the larger model agree: these two variables are not significant. Therefore, the smaller model is preferred.
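As a rough check, the degrees of freedom and the p-value of the partial F-test can be reproduced from the reported figures. The sketch below assumes 84 observations, five slopes in the larger model and three in the smaller one; the function name is hypothetical, and the closed-form tail probability it uses holds only when the numerator has exactly 2 degrees of freedom.

```python
def f_upper_tail_two_numerator_df(f_stat, df_den):
    """P(F > f_stat) for an F distribution with 2 and df_den degrees of freedom.

    For 2 numerator degrees of freedom the survival function has the
    closed form (1 + 2 * f / df_den) ** (-df_den / 2).
    """
    return (1.0 + 2.0 * f_stat / df_den) ** (-df_den / 2.0)


n_obs = 84          # observations reported in the Data section
slopes_full = 5     # larger model: all five explanatory variables
slopes_reduced = 3  # smaller model: population and unemployment dropped

df_num = slopes_full - slopes_reduced  # number of slopes tested = 2
df_den = n_obs - slopes_full - 1       # residual df of the larger model = 78

p_value = f_upper_tail_two_numerator_df(0.1524, df_den)
print(df_num, df_den, round(p_value, 4))  # 2 78 0.8589
```

The result matches the reported p-value, which confirms that the test uses 2 and 78 degrees of freedom.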

Page 4: Gini index and income inequality

3

Based on the scatter plot, the percentage of urban population has a positive linear relationship with the tertiary education enrolment ratio: countries with a high percentage of urban population tend to have a high enrolment ratio. GDP per capita has a curved relationship with both the percentage of urban population and the enrolment ratio. The Gini index has a negative linear relationship with GDP per capita, the enrolment ratio and the percentage of urban population, and the correlation coefficients between the Gini index and these three explanatory variables are all negative. In addition, VIF values were computed for each explanatory variable in the model; none of them is large, so multicollinearity is not a problem for this model.
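For readers unfamiliar with the VIF, the following sketch shows one way to compute it for three explanatory variables, using the fact that the VIFs are the diagonal entries of the inverse of the predictors' correlation matrix. The function names are hypothetical, and the input values are made up for illustration; they are not the data used in this paper.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def vif_three_predictors(x1, x2, x3):
    """VIFs of three predictors: diagonal of the inverse 3x3 correlation matrix."""
    r12, r13, r23 = pearson(x1, x2), pearson(x1, x3), pearson(x2, x3)
    det = 1 + 2 * r12 * r13 * r23 - r12 ** 2 - r13 ** 2 - r23 ** 2
    return ((1 - r23 ** 2) / det, (1 - r13 ** 2) / det, (1 - r12 ** 2) / det)


# Made-up values for six hypothetical countries (illustration only).
gdp_per_capita = [1200, 4500, 9800, 22000, 41000, 56000]
urban_population = [30, 45, 58, 72, 80, 85]
enrolment_ratio = [8, 22, 35, 48, 70, 60]

vifs = vif_three_predictors(gdp_per_capita, urban_population, enrolment_ratio)
print([round(v, 2) for v in vifs])

# Mutually uncorrelated predictors give the smallest possible VIF, 1.
orthogonal = vif_three_predictors([1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1])
print(orthogonal)
```

A VIF of 1 means a predictor is uncorrelated with the others; values well above 1 signal that the predictor is largely explained by the remaining ones.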

Page 5: Gini index and income inequality

4

In the residual plot for “GDP per capita”, the variability of the residuals increases as the x value increases, so the residuals are heteroscedastic. The residual plots for urban population and enrolment ratio show a patternless cloud of points, so the residuals are homoscedastic with respect to these variables. In the normal Q-Q plot, the points do not fall along a straight line, which indicates that the assumption of normality of the residuals is not satisfied.

Results

Page 6: Gini index and income inequality

5

The estimated regression:

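\[
\hat{y}_{\text{Gini}} = 0.355 - 0.000149\,x_{\text{GDP per capita}} + 0.227\,x_{\text{urban population}} - 0.2117\,x_{\text{enrolment ratio}}
\]

The intercept is in Gini index units, the slope for GDP per capita is in Gini index units per dollar, and the slopes for urban population and enrolment ratio are in Gini index units per percentage point.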

Interpretation of y-intercept and slopes:

• Y-intercept: when GDP per capita, percentage of urban population and enrolment ratio are all equal to 0, the predicted Gini index is 0.355. In our data, the values of the three explanatory variables are far above 0, so this is an extrapolation. In practice, GDP per capita, percentage of urban population and enrolment ratio cannot be 0, so the y-intercept has no meaningful interpretation.

• Partial slope for GDP per capita: holding the other explanatory variables constant, a one dollar increase in GDP per capita is associated with a 0.000149 decrease in the Gini index on average.

• Partial slope for percentage of urban population: holding the other explanatory variables constant, a one percentage point increase in the percentage of urban population is associated with a 0.227 increase in the Gini index on average.

• Partial slope for enrolment ratio: holding the other explanatory variables constant, a one percentage point increase in the tertiary education enrolment ratio is associated with a 0.2117 decrease in the Gini index on average.
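The meaning of a partial slope can be illustrated by evaluating the estimated regression at two points that differ in a single explanatory variable. The coefficients below are the ones reported above; the function name and the input values (a GDP per capita of 10,000 dollars, 60% urban population, 40% enrolment) are hypothetical and chosen only for illustration.

```python
# Coefficients as reported in the estimated regression.
INTERCEPT = 0.355
SLOPE_GDP = -0.000149
SLOPE_URBAN = 0.227
SLOPE_ENROLMENT = -0.2117


def predicted_gini(gdp_per_capita, urban_population, enrolment_ratio):
    """Fitted value of the estimated regression for one country."""
    return (INTERCEPT
            + SLOPE_GDP * gdp_per_capita
            + SLOPE_URBAN * urban_population
            + SLOPE_ENROLMENT * enrolment_ratio)


# Hypothetical inputs: only the urban population share changes, by one point.
base = predicted_gini(10_000, 60.0, 40.0)
one_more_urban_point = predicted_gini(10_000, 61.0, 40.0)

change = one_more_urban_point - base
print(round(change, 4))  # 0.227
```

The difference equals the urban population slope regardless of the values at which GDP per capita and enrolment ratio are held fixed, which is what "holding the other explanatory variables constant" means.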

The slopes of all three explanatory variables are significantly different from zero, so all three should be kept in the model. Improvement is still necessary. For example, because the residuals are heteroscedastic with respect to “GDP per capita”, a variance-stabilizing transformation may be needed. Alternatively, log() can be used to transform “GDP per capita” so that its histogram looks more symmetric.
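The effect of a log transformation on a right-skewed variable can be seen in a small sketch. The GDP per capita values below are made up for illustration and are not the paper's data; the `skewness` helper is a hypothetical name for the usual moment-based sample skewness.

```python
from math import log


def skewness(values):
    """Moment-based skewness: third central moment over variance to the power 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Made-up, right-skewed GDP per capita values in dollars (illustration only).
gdp_per_capita = [800, 1500, 3000, 5000, 9000, 15000, 40000, 90000]
log_gdp_per_capita = [log(v) for v in gdp_per_capita]

skew_raw = skewness(gdp_per_capita)
skew_log = skewness(log_gdp_per_capita)
print(round(skew_raw, 2), round(skew_log, 2))
```

A skewness close to 0 corresponds to a roughly symmetric histogram; for these values the log scale is much closer to symmetric than the raw scale.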


References

World Bank. (2016). GINI index (World Bank estimate). [online] Available at: http://data.worldbank.org/indicator/SI.POV.GINI [Accessed 14 Apr. 2016].

Sarel, M. (1997). How Macroeconomic Factors Affect Income Distribution: The Cross-Country Evidence. IMF Working Papers, 97(152), p.1.