lecture 13 finalpubdocs.worldbank.org/en/940961593193977328/lecture-13... · lecture 13 1 1...

1

Training

Measuring inequalityLECTURE 13

1

1

Training

Outline for final lectures

§ Once datasets have been finalized, it is time to produce results, with the aim of representing the patterns emerging from the data.

§ In practice?

§ Inequality

§ Poverty

§ Basic summary statistics on household demographics, education, access to services, etc.

§ Average expenditures and incomes2

this lecture

next lecture

final lecture

2

Training

Inequality and poverty measurement

1) a measure of living standards

2) high-quality data on households’ living standards

3) a distribution of living standards (inequality)

4) a critical level (a poverty line) below which individuals are classified as “poor”

5) one or more poverty measures

living standard

persons

3

3

2

Training

Cowell (2011)

99.9% of this lecture is explained with better words in Cowell’s work: this book and other (countless) journal articles

4

Training

Warning

§ During the course we stressed the distinction between the concepts of living standard, income, expenditure, consumption, etc.

§ In this lecture we make an exception, and use these terms interchangeably

§ Similarly, I will not make a distinction between income per household, per capita, or per adult equivalent

§ For once, and for today only, we will be (occasionally) inconsistent

5

5

Training

Focus on the term 'inequality'

§ “When we say income inequality, we mean simply differences in income, without regard to their desirability as a system of reward or undesirability as a scheme running counter to some ideal of equality” (Kuznets 1953: xxvii)

§ In practice, how can we appraise the inequality of a given income distribution? Three main options:

① Tables

② Graphs

③ Summary statistics

8

3

Training

Tables: an assessment

§ In general, tables are not recommended when the focus is inequality

§Difficult to get a clue of the extent of inequality in the distribution by looking at a table.

§ To illustrate, let us look at the the distribution of income and taxable income in South Africa.

9

9

Training

Source: 2018 Tax statistics, National Treasury and the South African Revenue Service

Can you tell how high or low inequality is?

10

Training

Another candidate: graphs

Do graphs (diagrams) helprepresent and understand inequality?

11

11

4

Training

Histograms

§ Let the interval [𝑥#, 𝑥%] denote the range of the data.

§ Partition [𝑥#, 𝑥%] into 𝑚∗ non-overlapping bins (intervals) of equal width h = (𝑥% −𝑥#)/𝑚∗.

§ A histogram estimate of the density 𝑓(𝑥) is the fraction of observations falling in the bin containing 𝑥, divided by the bin width h:

0𝑓 𝑥 =(fraction of sample obs. in same bin as x)

ℎ

§ The area of each bar (= ℎ× 0𝑓 𝑥 ) is interpreted as the fraction of sample observations within the bin. All bar areas sum up to unity.

12

12

Training

Histograms?Note: extreme values (top 1%) have been removed

0

5.0e-05

1.0e-04

1.5e-04

2.0e-04

Den

sity

0 5,000 10,000 15,000 20,000Per capita expenditure

(2015 Rupees/person/month)

10 bins

0

1.0e-04

2.0e-04

3.0e-04

Den

sity



80 bins

0

1.0e-04

2.0e-04

3.0e-04

Den

sity



240 bins10

bins80

bins240 bins

13

Training

Histograms: an assessment

§ The position and number of bins is arbitrary

§ Inherently lumpy: discontinuities at the edge of each bin

§ Can provide very different pictures of the same distribution

§ Look of graphs highly dependent on treatment of extreme values

§ Read Cowell, Jenkins and Litchfield (1996) for more.

14

5

Training

Beyond histograms

§ A probability density function (PDF) is the ‘continuous version’ of a histogram

§ A convenient way to introduce the PDF is by starting from the cumulative distribution function (CDF)

15

15

Training

The cumulative distribution function (CDF)

§ The cumulative distribution function (CDF) of a random variable 𝑋 is defined as follows:

𝐹 𝑥∗ = EF

G∗

𝑓 𝑥 𝑑𝑥

§ 𝐹 𝑥∗ is the proportion of individuals having 𝑋 less than or equal to 𝑥∗ .

§ If 𝑋 is income and, say, 𝑥∗ = 2,000 Rps., then 𝐹 𝑥∗ = Pr(𝑋 < 2,000), that is the fraction of people with less than 2,000 Rps.

16

Training

The empirical cumulative distribution function (ECDF)Iraq IHSES 2007/2012., Cumulative distribution of welfare aggregate (p.21)

§ Pick any income level on the 𝑥 -axis, and the curve 𝐹 𝑥 will tell you the percentage of individuals in the population having a level of income lower than 𝑥.

§ The CDF is more effective in telling you about the incidence of poverty than about inequality.

17

IRAQ, The Unfulfilled Promise of O il and GrowthPoverty, Inclusion and Welfare in Iraq, 2007-2012

Volume 1: Main Report

17

6

Training

The probability density function (PDF)

§ The probability distribution function (PDF) is the derivative of the CDF:

𝑓 𝑥 =𝑑𝐹(𝑥)𝑑𝑥

§ Simply, the pdf describes the likelihood that a random variable 𝑋 takes on a given value.

§ Note: for continuous random variables, the probability that 𝑋 takes on any particular value 𝑥 is 0, the probability being defined only over (a,b) intervals.

18

Training

Iraq, 2007/12PDF of welfare aggregate (p.21)

§Does the PDF work any better than the CDF when describing inequality?

§ Probably, yes.What do you think?

IRAQ, The Unfulfilled Promise of O il and GrowthPoverty, Inclusion and Welfare in Iraq, 2007-2012

Volume 1: Main Report

21

Training

PDFs: an assessment

§ Similar drawbacks as histograms

§ The bandwidth (how “smooth” the graph is) is arbitrary

§ In most cases, PDFs require some trimming of top values to avoid looking “squished” and being unreadable

§ In general, the PDF is not very effective in showing what is going on in the upper tail, but that is important information when focusing on inequality

23

7

Training

The quantile function

§ Let p = F(x) be the proportion of people in the population with income lower than x.

§ The quantile function Q(p) is defined as:

§ Q(p) is the income level below which we find a proportion p of the population.

F Q p( )!" #$= p or Q p( ) = F−1 p( )inverse cdfcdf

Source: Haughton and Khandker (2009).

Pen’s Parade (Quantile Function) for Expenditure per Capita, Vietnam, 1998

24

Training

0

.2

.4

.6

.8

1

Prop

ortio

n of

tota

l exp

endi

ture

(cum

ulat

ive,

%)

0 .2 .4 .6 .8 1Proportion of population (cumulative, %)

Equality

2015

The Lorenz Curve (1905)Picture and intuition

§ Horizontal axis: cumulative % of population(individuals ordered poorest to the richest)

§ Vertical axis: cumulative % of income received by each cumulative % of population.

§ 45-degree line: Lorenz curve if perfect equality.

§ The overall distance between the 45-degree line and the Lorenz curve is indicative of the amount of inequality present in the population.

26

Training

The Lorenz Curve (1905)Mathematically

§ The Lorenz curve L(p) is defined as follows:

𝐿 𝑝 = ∫PQ R S TU

∫PV R S TU

§ The numerator sums the incomes of the poorest p% of the population;

§ The denominator sums the incomes of all.

§ The ratio L(p) indicates the cumulative % of total income held by a cumulative proportion p of the population.

§ Example: if L(0.5) = 0.3, then we know that the 50% poorest individuals hold 30%of the total income in the population.

27

8

Training

Quantile function and Lorenz curve: an assessment

§ These graphical tools emphasize the ranking of shares of the population on the basis of income

§ The Lorenz curve clearly shows how far the distribution is from perfect equality

§ Still, no graph is as straightforward and easily comparable as a scalar measure of inequality

28

28

Training

Recap and next steps

§ Not all graphs are OK to represent inequality

§ Lorenz curve is the most popular

§ A better conceptual understanding comes from constructing inequality measures from first principles.

§ The most straightforward approach: inequality measures as pure statistical measures of dispersion.

29

Training

Inequality indicators

30

30

9

Training

Measures of dispersion

§ Range R = xXYZ − xX[\

§ What do you think?

§ Variance σ^ = _\∑[a_

\ (x[ − µ)^

§ Coefficient of Variation 𝐶𝑉 = efg

§ …

31

Training

Quantiles, Quintiles, Quartiles, …

§ Take a group of observations for variable 𝑥 (e.g. 𝑥 = expenditure)

§ Order observations from smallest 𝑥 to largest 𝑥 (poor to rich)

§ Partition the set of observations into 𝑛 subsets of equal size

§ 𝑄_, 𝑄^, … , 𝑄j#_ are values of 𝑥 that identify the cut points between the 𝑛 subsamples, and are called quantiles.

§ Some quantiles have special names

32

Training

Example with n = 4

25% of data 25% of data 25% of data 25% of data

lowest x highest x

𝑄_ 𝑄^ 𝑄k

also called first quartile

or p25

also called second quartile

or p50or median

also called third quartile

or p75

Note that p indicates percentiles, that is, quantiles when n = 100

33

10

Training

Quantile ratios

§ A quantile ratio measures the gap between the rich and the poor.

§ It is defined as the ratio of two quantiles, Q(p2)=Q(p1) using percentiles p1and p2.

§ Three popular indices are:

§ The quintile ratio (p2 = 80 and p1=20):

QR = Q(p80)/Q(p20)

§ the decile ratio (p2 = 90 and p1=10):

DR = Q(p90)/Q(p10)

34

Training

The decile ratioper adult equivalent income (OECD def.)

2.9

25.6

6.34.9

4.2

35

Training

Quantile share ratios

§ Let S20 denote the share of (equivalised disposable) income received by the bottom 20% of the population, and S80 the income share received by the top 20% of the population.

§ The quintile share ratio is defined as follows:S80-20 = S80/S20

§ The quintile share ratio is the level-1 Laeken indicator, chosen by the EU to monitor income distribution.

36

11

Training

QSR around the world

37Source: (WDI) Income share held by highest 20% over Income share held by lowest 20%, last available 2010-2017

37

Training

S80/S20 in Sub-Saharan Africa

38

010

2030

Burkina

Faso

Maurita

nia

Sierra

Leon

eNige

r

Guinea

Gambia

, The

Liberi

a

Tanza

nia

Burund

i

Ethiop

iaGab

onKen

ya

Seneg

al

Malawi

Ugand

a

Cote d'I

voire

Rwanda

Zimba

bwe

Congo

, Dem

. Rep

.Tog

oCha

dGha

na

Camero

on

Guinea

-Bissau

Congo

, Rep

.

Mozam

bique

Botswan

aBen

in

Leso

tho

Zambia

Namibia

South

Africa

28

5

Source: (WDI) Income share held by highest 20% over Income share held by lowest 20%, last available 2010-2017

38

Training

The Gini CoefficientA definition

§ Yitzhaki (1997) counts more than a dozen formulas available for the Gini index.

§ A classic definition of the Gini coefficient:

§ The Gini coefficient ranges from 0 (all recipients have the same income: maximum equality), to 100 (all income is received by one recipient: maximum inequality).

G =1

2n2µxi − x j

j=1

n

∑i=1

n

∑

39

12

Training

The Gini CoefficientInterpretation – Farris 2010: 857

§ Consider the following experiment.

§ Pick two households at random and record the lower of their two incomes: call the result 𝑦 . Let µ denote the mean income. It turns out that:

§m

g= 1 − 𝐺

§ Assuming that the Gini index is 40%, we can conclude that the lower of two family incomes, chosen at random, is about 60% (=100-40%) of the national mean: on average, the poorest of any two families earns only about 60% of the national mean.

40

Training

The Gini CoefficientGraphical interpretation

§ The Gini index is two times the area A between the Lorenz curve and the equality diagonal:

𝐺𝑖𝑛𝑖 =𝐴

(𝐴 + 𝐵)= 2𝐴

= 2 _^− 𝐵 = 1 − 2𝐵

41

Training

Gini around the world

42Source: (WDI), GINI index (World Bank estimate) last available year 2010-2017

42

13

Training

Gini in Sub-Saharan Africa

43

020

4060

Maurita

nia

Guinea

Sierra

Leon

eNige

r

Liberi

a

Burkina

Faso

Gambia

, The

Tanza

niaGab

on

Burund

i

Ethiop

ia

Seneg

al

Kenya

Cote d'I

voire

Congo

, Dem

. Rep

.

Ugand

aTog

o

Zimba

bwe

ChadGha

na

Rwanda

Malawi

Camero

onBen

in

Congo

, Rep

.

Guinea

-Bissau

Botswan

a

Mozam

bique

Leso

tho

Zambia

Namibia

South

Africa

63%

33%

Source: (WDI), GINI index (World Bank estimate) last available year 2010-2017

43

Training

Gini around the WorldA selection of countries

44Source: (WDI), GINI index (World Bank estimate) last available year 2010-2017

Min

Max

44

Training

Recap

§Quantile ratios, quantile share ratios, Gini, are all popular inequality measures

§ They do a fine job at representing inequality with a number

§ Problemthey do not always have all the properties that we would want for an inequality measure

§ Solutionsolve the problem backwards. First lay out some desirable properties, then construct a measure that complies with them

46

46

14

Training

Deriving inequality measures from axioms

§ Axiom: a statement accepted as true as the basis for argument or inference.

§ The axiomatic approach allows us to “custom-build” inequality measures that fit our needs:

1. We define a set of elementary properties (axioms) that we think inequality measures ought to have

2. We obtain a mathematical formula that delivers a class of inequality measures satisfying the axioms

47

Training

Five axioms of inequality measures

A. Anonymity (or Symmetry)Who is earning the income does not matter

B. The Population PrinciplePopulation size does not matter

C. Scale Invariance (or Relative Income Principle)Income levels do not matter

D. The (Pigou-Dalton) Principle of TransfersRank-preserving rich-to-poor transfers reduce inequality

E. Decomposability (or Subgroup Consistency)The measure is additively decomposable

48

48

Training

Generalized Entropy Indices (GEI)Shorrocks (1980)

§ Inequality measures that satisfy all axioms (A to E), must have the following form:

§𝐺𝐸 𝜃 = _vf#v

_j∑wa_j Gx

G

v− 1

§where 𝜃 is a parameter that may be given any value (positive, zero or negative).

§Depending on the value assigned to 𝜃, different indices are obtained.

53

15

Training

The family of Generalized Entropy Indices

§ If 𝜃 = 0Mean Logarithmic Deviation: 𝐺𝐸 0 = 𝑀𝐿𝐷 = _

j∑wa_j log G

Gx

§ If 𝜃 = 1Theil Index: 𝐺𝐸 1 = 𝑇𝐻𝐸𝐼𝐿 = _

j∑wa_j Gx

G logGxG

§ If 𝜃 = 2Half Coeff. of Variation Squared: 𝐺𝐸 2 = 𝒦f

^

§ The case of Uganda illustrates

54

Training

UgandaNational Household Survey 2012/2013

55

55

Training

§ Inequality decompositions are typically used to estimate the extent to which the heterogeneity of the population affects overall inequality. Two popular techniques are:1. Decomposition by population sub-group

2. Decomposition by income source

§ We focus on the former:§ Societies can often be partitioned into groups (e.g. North-South). We would like to be able to

decompose total inequality into two components, namely the inequality within the constituent groups, and inequality between the groups:

𝐼�� = 𝐼�� + 𝐼��

Inequality decomposition

56

16

Training

Rwanda 2016/17Fifth Integrated Household Living Conditions Survey, EICV5 (2016/17)

58

58

Training

Lessons learned

61

§Many ways to describe inequality, some more effective than others

§Graphs: most notable are quantile functions and Lorenz curves

§Measures: different inequality measures lead to different results

§ The Gini index is an extremely popular measure - we recommend its use.

§ The GEI (generalized entropy indices), and in particular the MLD(mean log deviation), are recommended because of their theoretical properties.

61

Training

ReferencesRequired readings

Cowell, F. (2011). Measuring inequality. OxfordUniversity Press. (Chapter 1 & 2)

Suggested readingsAtkinson, A. B. (1970). On the measurement of inequality.Journal of economic theory, 2(3), 244-263.Cowell, F., Jenkins, S. P., and Litchfield, J. A. (1996). Thechanging shape of the UK income distribution: kernel densityestimates (pp. 49-75). Cambridge University Press.Farris, F. A. (2010). The Gini index and measures ofinequality. The American Mathematical Monthly, 117(10),851-864.Haughton and Khandker (2009). Handbook on Povertyand Inequality, Chapter 6.

Pen, J. (1971). Income Distribution: facts, theories, policies.Praeger.Pyatt, G. (1976). On the interpretation and disaggregation ofGini coefficients. The Economic Journal, 86(342), 243-255.Shorrocks, A. F. (1980). The class of additively decomposableinequality measures. Econometrica: Journal of theEconometric Society, 613-625.

62

17

Training

Thank you for your attention

63

63

Training

Homework

64

64

Training

Exercise 1 – Engaging with the literature

Considering equations (10) to (12) in Farris (2010) give a brief interpretation of a Gini index of 63% for South Africa

65

18

Training

Exercise 2 - Inequality in South Asia

§ Turn to page 2 of this report (see next slide)

§What criticisms would you make to this chart?

67

Training

Note: Orange and light brown bars indicate countries where inequality is estimated based on consumption per capita. Light blue bars indicate countries with estimates based on income per capita.

68

Training

Exercise 3 - Functional vs Personal distribution of income

§ The nature of the relationship that links the evolution of income shares to income inequality is complex and still widely debated among researchers.

§ In that context, comment on Figure 19 of the ILO Global Wage Report 2016/2017.

69

69

lecture 13 finalpubdocs.worldbank.org/en/940961593193977328/lecture-13... · lecture 13 1 1...

Documents