
DIPARTIMENTO DI ECONOMIA, MANAGEMENT E METODI QUANTITATIVI

Via Conservatorio 7 20122 Milano

tel. ++39 02 503 21501 (21522) - fax ++39 02 503 21450 (21505) http://www.economia.unimi.it

E Mail: [email protected]

LOOKING IN YOUR PARTNER’S POCKET BEFORE SAYING “YES!”
INCOME ASSORTATIVE MATING AND INEQUALITY

CARLO V. FIORIO, STEFANO VERZILLO

Working Paper 2/2018

FEBRUARY 2018


Looking in your partner’s pocket before saying “Yes!”

Income assortative mating and inequality*

Carlo V. Fiorio† and Stefano Verzillo‡

February 19, 2018

Abstract

Income assortative mating has seldom been investigated in the literature, mostly because of endogeneity concerns related to simultaneity and omitted variable biases. Using the tax records of a major region in Italy for 2007 to 2011, we first show that income assortative mating is present mostly at very high income levels. In comparison with the median woman, women belonging to the top 1% of their income distribution are about 25 times more likely to get married to men belonging to the top 1% of their income distribution. Second, we show that, even when dealing with simultaneity and omitted variable biases, assortative mating remains significantly larger for very high-earning couples. Finally, we assess the effect of income assortative mating on inequality by assuming random pairing. Our results are consistent with previous results showing that the effects are limited on income inequality as measured by a summary statistic, such as the Gini index. However, by exploiting the large size of our administrative dataset, we show that, when a bivariate partition of the population is considered (e.g. couples with both spouses in the top 1% of their gender-specific distribution), the effect of assortative mating on inequality is huge, as, for the average counterfactual income determined by assuming random pairing, the top income share would be reduced by more than 80%.

JEL codes: J12, J41

Keywords: assortative mating, administrative data, inequality, income shares.

* We would like to thank Joseph Altonji, Erich Battistin, Massimiliano Bratti, Daniele Checchi, Pierre-André Chiappori, Gianni De Fraja, Nico Pestel, Giovanni Pica, Erik Plug, Zhenchao Qian, Enrico Rettore, Frédéric Robert-Nicoud, Emmanuel Saez, Christine Schwartz and Daniel Waldenström for helpful comments and suggestions. Earlier versions were presented at the SASE (Berkeley, 2016), ECINEQ (New York, 2017), IIPF (Tokyo, 2017), EALE (St. Gallen, 2017) conferences and seminars at UNIL (Lausanne), European Commission-JRC (Ispra), Univ. of Milan, Bocconi University and IRVAPP (Trento). The information and views set out in this paper are those of the authors and do not reflect a policy position of the European Union. Neither the European Commission nor any person acting on behalf of the Commission may be held responsible for the use that may be made of this publication.

† University of Milan and Irvapp-FBK. Address: Department of Economics, Management and Quantitative Methods, Via Conservatorio 7, 20122 Milan; email: [email protected]

‡ European Commission, Joint Research Centre (JRC) and the Interuniversity Research Centre on Public Services (CRISP), University of Milano-Bicocca.

1 Introduction

The theoretical literature on assortative mating in the marriage market has developed extensively since the seminal paper by Becker (1973), predicting perfect assortativeness (for a comprehensive survey, see Chiappori and Salanié, 2016). The sociological literature has contributed to the analysis of assortative mating (often termed "homogamy"), focusing on trends over time (e.g. Mare, 1991; Schwartz and Mare, 2005; Gonalons-Pons and Schwartz, 2017). The analysis of assortative mating in the sociological literature mainly uses log-linear models for contingency tables (Agresti, 2002) to provide estimates of the changing associations between couples’ characteristics while controlling for shifts in their marginal distributions. Usually, marginal distributions refer to a characteristic (e.g. education) of the wife and husband and the estimated models include a dummy variable identifying couples with the same level of that characteristic to test whether or not "homogamy" is significantly different from zero, namely mating is different from random. A significant and positive coefficient for the dummy variable suggests positive assortative mating, i.e. like marries like.

When choosing the characteristics to be used for assessing homogamy, natural candidates would be social class or income; however, these are likely to be endogenous. For instance, consider a newly wed couple in which the husband belongs to a high social class (or has a high level of income) and marries a high social class (or high-income) woman. It may be that the husband truly belongs to the high social class, or that he joined it only by marrying a high social class woman who, because of marriage, shared family and social networks, economic resources and business opportunities with him. Ideally, one would like to observe whether or not newly-weds who appear similar in the year of marriage or after would have been judged similar if they had been observed some years before marrying, when they may not have known each other or been promised to each other.

Even the best panel survey data (e.g. the Panel Study of Income Dynamics - PSID), which follow a man (or woman) over a long period and often before and after he (she) got married, record information on his (her) spouse only after she (he) entered the panel by marriage, and little information is known from before the wedding. Typically, the literature uses the highest educational attainment, as this is available in most surveys, is affected by limited measurement error and, in most cases, relates to a decision made some years before getting married. Although this is not a perfect solution, as some people finish education after marriage and others have further motivation to complete or not complete education because they have been promised to someone, it allows one to reduce or at least mitigate the extent of simultaneity, and hence endogeneity, in the measurement of assortative mating.

This paper studies income assortative mating and its effects on inequality, focusing on newly-weds. We use a novel dataset, collecting all the tax forms for the population of Lombardy, the richest region in Italy, with a population close to 10 million, i.e. about as large as Portugal or Sweden and twice the size of Denmark, over the period 2007-2011. The Italian institutional setting, requiring individual tax filing and compulsory declaration of one’s marital partner, allows us to identify the marital status of residents (i.e. single or in a couple) and the year of marriage if it is between 2008 and 2011. We focus on newly wed and single individuals (i.e. those who were on the marriage market before each year considered), compute the percentiles of employment income by gender and group them in 100 income groups (percentiles). We then select newly-weds specifically and, by building the 100 × 100 contingency table of male and female percentiles, we show that assessing assortative mating by estimating the average probability that couples lie along the main diagonal of the contingency table would be misleading, as assortative mating is mostly concentrated in the top levels of income, increasing steeply over the 95th percentile.[1] The unconditional frequency of newly wed couples in which both spouses are in the top 1% of their respective gender distributions is about 25 times larger than that of newly wed couples in which at least one of the two spouses has a median level of income. A similar pattern of unconditional assortative mating was recently observed using European Union Statistics on Income and Living Conditions (EU-SILC)[2] data for France by Frémeaux and Lefranc (2017), who focused mostly on potential earnings without addressing the issue of endogeneity given the data structure. We address the endogeneity issue by exploiting the panel dimension of our dataset, which allows us to determine the income percentile group of all newly-weds in the income distribution up to three years before marriage, and we use the same approach adopted so far in the educational assortative mating literature. Specifically, we assess whether or not positive assortative mating exists when controlling for the ranking of each newly wed spouse before marriage, assuming that, for up to three years previously, newly-weds did not know each other or had not been promised to each other.

Our paper contributes to the assortative mating literature by providing an estimate of assortative mating based on income, which accounts for the bias due to simultaneity and a large number of possible confounding factors. Although the lack of a suitable instrument prevents us from making a causal interpretation of our results,[3] by exploiting the balanced panel structure of our dataset for all newly-weds, we are able to provide an estimate of the extent of the simultaneity bias.

[1] The average newlywed husband at the 95th percentile earns €39,600, about 20% more than the average wife in the 95th percentile.

[2] EU-SILC is an instrument aiming to collect timely and comparable cross-sectional and longitudinal multidimensional microdata on income, poverty, social exclusion and living conditions across the European Union since 2004 (https://tinyurl.com/nspxlmq).

Our results show, first, that assortative mating at the top levels of income remains large even when newly-weds are observed three years before the wedding takes place and, second, that, at the top levels of income, simultaneity bias is negligible even when dealing with time-invariant individual characteristics such as family background or education. These results are shown for newly-weds where both spouses are aged between 25 and 44, but robustness analyses broadly confirm the picture and suggest that income assortative mating also increases for older newly-weds.

The analysis of assortative mating has recently also gained significance in the economic literature and has been investigated as a possible additional driver of household inequality. Although inequality as measured by indices looking at the whole distribution (such as the Gini and Theil indices, or the coefficient of variation) presents a mostly increasing trend, albeit with large variability between countries and over time, recent literature (e.g. Atkinson, Piketty, and Saez, 2011, and the World Wealth & Income Database-WID project[4]) has devoted a lot of attention to what happens in the top 1% (or above) of the income distribution, highlighting the increasing contribution of wages and salaries. The increase in college premiums and the role of phenomena such as minimum wages and unions are often regarded as the main factors explaining increased inequality (e.g. see Goldin and Katz, 2007; Autor, 2014; Katz and Autor, 1999). Changing female labour force participation and changing labour earnings distributions between households have also been analysed as factors driving inequality changes (e.g. Daly and Valletta, 2006; Fiorio, 2011; Larrimore, 2014). Recent literature has addressed whether or not educational assortative mating has played a role in explaining increased inequality, mostly by estimating the Gini index and contrasting it with the counterfactual situation where spouses were randomly matched (e.g. Eika, Mogstad, and Zafar, 2014; Greenwood, Guner, Kocharkov, and Santos, 2014; Hryshko, Juhn, and McCue, 2017a; Pestel, 2017). This literature concludes that assortative mating contributes positively to household inequality, although its contribution is minor, as the Gini index is 'only' 10% higher than the counterfactual situation.

[3] For one of the very few contributions using an instrumental variable approach, see Barban, De Cao, Oreffice, and Quintana-Domeque (2016), which was unfeasible in our context for lack of data.

[4] The comprehensive and constantly updated dataset of the WID project can be found at http://wid.world/.

Our paper’s third and final contribution to the existing literature is to show that positive assortative mating has a strong impact on the concentration of income. We focus on the effect of assortative mating on household income distribution in the population of newly-weds, assuming equal sharing of resources between the couple, and comparing the actual distribution with the distribution of counterfactual households obtained by randomly allocating spouses 1,000 times. Results suggest that the very large positive assortative mating effect observed for the top levels of income has a modest inequality-increasing effect, not larger than 10% of the Gini index, consistent with the results of other authors in different institutional settings, as cited above. However, we suggest that this conclusion is driven by the use of summary measures such as the Gini index, which are mostly sensitive to changes in the bulk of the income distribution (Cowell, 2011). Using alternative measures, such as those from the family of Atkinson indices with large inequality aversion coefficients or from the family of generalised entropy with large "α-coefficients" (i.e. where large weight is given to differences between incomes at the top of the distribution), would be problematic, as these indices are very sensitive to outliers (Cowell and Victoria-Feser, 1996). Along the lines of the literature initiated by Piketty and colleagues (Piketty, 2001; Piketty and Saez, 2003; Atkinson, 2005), we focus on the income of top earners and, focusing on all newly-weds aged 25-44 earning wage or salary (employment) income, we find that couples in which both spouses are in the top 1% of their respective income distributions earn over 61 times more than if income were equally distributed.
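The random-pairing counterfactual described above can be sketched in a few lines. This is only an illustrative sketch under stated assumptions, not the authors' code: the function names (`gini`, `household_incomes`, `counterfactual_gini`) and the toy incomes are hypothetical, the Gini index uses the standard rank-based formula, and equal sharing is implemented as the simple average of the two spouses' incomes.

```python
import random


def gini(incomes):
    """Gini index of a list of non-negative incomes (0 means equality)."""
    xs = sorted(incomes)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-based formula: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


def household_incomes(wives, husbands):
    """Per-capita household income under equal sharing within the couple."""
    return [(w + h) / 2.0 for w, h in zip(wives, husbands)]


def counterfactual_gini(wives, husbands, draws=1000, seed=0):
    """Average Gini over `draws` random re-pairings of husbands to wives."""
    rng = random.Random(seed)
    shuffled = list(husbands)
    total = 0.0
    for _ in range(draws):
        rng.shuffle(shuffled)
        total += gini(household_incomes(wives, shuffled))
    return total / draws


# Toy data (made up): four perfectly sorted couples.
wives = [10.0, 20.0, 30.0, 200.0]
husbands = [12.0, 25.0, 40.0, 300.0]
actual = gini(household_incomes(wives, husbands))
random_pairing = counterfactual_gini(wives, husbands)
```

In this toy case the observed pairing is perfectly sorted, so `actual` exceeds `random_pairing`; the gap between the two is the kind of quantity the counterfactual exercise is meant to measure.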

The rest of the paper is organised as follows. In Section 2, we describe the data and present the selection rules adopted. In Section 3, we explain how we organised the data to show income assortative mating and present the distribution of the Excessive Mating Ratio. In Section 4, we present the endogeneity issues for income assortative mating and describe how we addressed them and, in Section 5, we present the main results and robustness checks. In Section 6, we compute the effect of assortative mating on inequality and, in Section 7, we discuss our results.

2 The data

In this paper, we explore the full administrative dataset of tax forms for a major Italian region (Lombardy) over the period 2007-2011, which, at present, is the longest time span available. This is the richest region of Italy, with about 10 million residents, as large as Portugal and twice the size of Denmark. Our dataset is based on administrative data,[5] allowing the exact identification of tax units.

Italian tax legislation requires the taxpayer to declare the tax identifier of all fiscally dependent relatives (i.e. with gross earnings below €2,840) and of their spouses, regardless of their income.

As shown in Table 1, we have annual information on approximately 10 million residents, of whom about 40% live as legally married couples.[6] Approximately 1.7% of all couples are new couples, i.e. couples that got married within that year. In 2011, the number of residents who got married was about 63.7 thousand, showing a decreasing trend over the period studied (66,210 in 2008), for reasons that we can only guess but might be due to increased job instability, economic crisis or changing values. The Italian National Statistical Institute (Istat; http://www.istat.it/), using registry data, reports very similar numbers of residents as couples and newly-weds in Lombardy over the same period, thus confirming the validity of the tax dataset we use.

[5] Data were analysed as part of a research programme between the Interuniversity Research Centre on Public Services (CRISP) at the University of Milano-Bicocca and the Tax and Income Department of the Lombardy Region. Tax records were analysed having been anonymised using an irreversible hashing algorithm.

Given our focus on assortative mating, in what follows we focus on newly wed couples in which both spouses are between 25 and 44, which is the age group of those who should have finished their educational cycle, are in their most fertile years and, we expect, are less likely to base their choice of marital partner on income.[7] Table 1 shows that, out of a total of 31,867 new couples (63,734 individuals) in 2011, about 75% (47,626 individuals) had both spouses aged 25-44. By focusing on 2011 and splitting the population by gender, Table 2 shows that, on average, women get married at the age of 32 and men at 34 (panel a). Labour-active newly wed women earn an average wage or salary (employment) income that is slightly larger than that of the average active woman in this age group, whereas newly wed men earn nearly 20% more than the average man in the same age group (panel b). Total annual labour earnings of the average newly wed self-employed man are close to those of the average newly wed male employee, while the representative newly wed self-employed woman earns on average 23,000 euros, about 3,000 euros more than the average newly wed female employee (panel c).

[6] Tax form data allow the identification of legally married couples only, as taxpayers in unmarried cohabiting couples are not allowed to claim for dependent spouse tax credit and are not obliged to declare their partner. These data also do not allow identification of second marriages as opposed to first marriages.

[7] This a priori expectation will be tested in the robustness section 5.1.

The proportion of self-employed people with positive income in the 25-44 age group is limited, comprising fewer than 5% of women and about 10% of men, with most people in this age group working as employees (Table 3).

Table 1: Total number of residents in Lombardy by year and marital status.

Age group         Status                  2008        2009        2010        2011
All ages          All residents     10,466,319   9,908,071   9,996,217   9,794,060
All ages          In a couple        3,836,995   3,870,915   3,875,486   3,837,662
All ages          In a new couple       66,210      62,998      62,712      63,734
Aged 25-44        All residents      3,234,338   2,949,590   2,915,903   2,750,288
Both aged 25-44   In a couple        1,342,160   1,310,950   1,261,876   1,207,160
Both aged 25-44   In a new couple       49,166      48,102      47,926      47,626

Notes: Individual observations. Couples are legally married couples only. New couples are couples that were not married the year before.

Table 2: Some descriptive statistics by gender for individuals aged 25-44 for 2011.

(a) Age

                   Women                                   Men
                   Obs        Mean       Std. Dev.         Obs        Mean       Std. Dev.
All residents      1,357,361  35.66      5.21              1,392,927  35.70      5.35
In a couple        669,745    37.09      4.87              537,415    38.02      4.43
In a new couple    23,813     31.98      4.45              23,813     34.31      4.55

(b) Employment income (wages and salaries)

                   Women                                   Men
                   Obs        Mean       Std. Dev.         Obs        Mean       Std. Dev.
All residents      939,450    18,695.20  441,495.50        1,092,874  25,733.93  1,444,901.00
In a couple        428,719    18,959.81  16,035.85         409,860    30,748.72  66,541.22
In a new couple    17,381     20,334.63  12,368.64         17,541     30,322.66  98,524.67

(c) Self-employment income

                   Women                                   Men
                   Obs        Mean       Std. Dev.         Obs        Mean       Std. Dev.
All residents      52,535     24,547.87  36,383.93         132,426    28,904.15  38,246.40
In a couple        27,192     25,078.38  33,170.84         69,852     31,603.94  42,715.91
In a new couple    1,151      23,064.91  23,944.58         3,029      29,284.09  36,493.21

Notes: Individual observations. Couples are legally married couples only. New couples are couples that were not married the year before. Employment (self-employment) income statistics are presented only for the sample of those with positive employment (self-employment) income.

Table 3: Proportions of individuals with zero income by gender for individuals aged 25-44. 2011 only.

(a) Proportion of zero employment income

                   Women                            Men
                   Obs        Mean   Std. Dev.      Obs        Mean   Std. Dev.
All residents      1,357,361  0.308  0.462          1,392,927  0.215  0.411
In a couple        669,745    0.360  0.480          537,415    0.237  0.425
In a new couple    23,813     0.270  0.444          23,813     0.263  0.440

(b) Proportion of zero self-employment income

                   Women                            Men
                   Obs        Mean   Std. Dev.      Obs        Mean   Std. Dev.
All residents      1,357,361  0.957  0.203          1,392,927  0.899  0.301
In a couple        669,745    0.955  0.207          537,415    0.864  0.343
In a new couple    23,813     0.947  0.224          23,813     0.867  0.340

Notes: Individual observations. Couples are legally married couples only. New couples are couples that were not married the year before.

3 The Excessive Mating Ratio

If mating were not correlated with income, one would expect that the likelihood of a rich man getting married to a poor woman would be the same as that of a rich man getting married to a rich woman. To investigate assortative mating based on income, out of the population of residents we selected the set of people who in year $t$ were either single or just married (i.e. they were on the marriage market in year $t-1$), grouped them by gender and year, ranked each of them into 100 gender-specific percentile groups (obtained by the 99 numbered points that divide the ordered set of income into 100 parts, each of which contains one-hundredth of the total) and built a $100 \times 100$ matrix of frequency counts for each year $t$. The cell $\{k, j\}_t$ identifies couples with a wife in the $k$-th wives’ income percentile and the husband in the $j$-th husbands’ income percentile in year $t$, where $\{k, j = 1, \ldots, 100\}$ and $t = 2008, \ldots, 2011$. For instance, the cell $\{100, 100\}_{2011}$ contains the couples newly wed in 2011 with both spouses belonging to the top 1% of their gender-specific income distribution and the cell $\{50, 100\}_{2011}$ contains couples with a husband in the top 1% and a wife whose income is between the 49th percentile and the median.
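The construction of the contingency table can be illustrated with a small sketch. It is a simplified illustration under assumptions of ours rather than the procedure actually run on the tax records: the helper names (`percentile_groups`, `contingency_table`) are hypothetical, ties in income are broken by list position, and the toy example uses 4 groups instead of 100.

```python
def percentile_groups(incomes, n_groups=100):
    """Assign each person a group in 1..n_groups by income rank.

    Ties are broken by position in the list, which is a simplification
    (an assumption of this sketch, not a rule taken from the paper).
    """
    n = len(incomes)
    order = sorted(range(n), key=lambda i: incomes[i])
    groups = [0] * n
    for rank, i in enumerate(order):
        # Ranks 0..n-1 are mapped proportionally onto groups 1..n_groups.
        groups[i] = rank * n_groups // n + 1
    return groups


def contingency_table(couples, wife_groups, husband_groups, n_groups=100):
    """Count newly wed couples in each cell {k, j}.

    `couples` holds (woman_index, man_index) pairs pointing into the
    marriage-market populations used to compute the groups.
    """
    table = [[0] * n_groups for _ in range(n_groups)]
    for w, h in couples:
        table[wife_groups[w] - 1][husband_groups[h] - 1] += 1
    return table


# Toy marriage market: 4 women, 4 men, 4 groups (made-up incomes).
women_income = [5, 50, 20, 80]
men_income = [10, 90, 30, 60]
wife_groups = percentile_groups(women_income, n_groups=4)
husband_groups = percentile_groups(men_income, n_groups=4)
couples = [(3, 1), (0, 0)]  # top woman with top man, bottom with bottom
table = contingency_table(couples, wife_groups, husband_groups, n_groups=4)
```

Note that the groups are computed on the whole marriage market (singles and newly-weds), while only the newly wed couples enter the table, so the rows and columns of the table need not have equal totals.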

The observed probability that a newly wed couple will fall into the cell $\{k, j\}_t$ is the number of newly wed couples in cell $\{k, j\}_t$, $c^t_{k,j}$, divided by the total number of newly wed couples, $\sum_{k,j} c^t_{k,j}$. This allows us to compute what we call the Excessive Mating Ratio (EMR) for year $t$, which is the ratio of the observed mating probability over the theoretical probability of mating under the assumption of random mating:

$$
\mathrm{EMR}^t_{kj} := \frac{\text{Actual relative frequency of couples in cell } \{k, j\}_t}{\text{Theoretical relative frequency under random mating}}
= \frac{\mathrm{Prob}(w = k, h = j \mid \text{married in year } t)}{\mathrm{Prob}(w = k) \times \mathrm{Prob}(h = j)}
= \frac{c^t_{kj} / \sum_{k,j} c^t_{k,j}}{(1/100) \times (1/100)}. \tag{1}
$$

If $\mathrm{EMR}^t_{kj} = 1$ for all $k, j$, where $k, j = 1, 2, \ldots, 100$, there would be no assortative mating based on income in year $t$, meaning that people get married for other reasons (e.g. love, randomness), which are not correlated with income. When $\mathrm{EMR}^t_{kj} > 1$, the observed relative frequency of couples in cell $\{k, j\}_t$ exceeds the theoretical probability, i.e. there is positive assortative mating. When, instead, $\mathrm{EMR}^t_{kj} < 1$, the observed relative frequency of couples in cell $\{k, j\}_t$ is exceeded by the theoretical probability, i.e. there is negative assortative mating.
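As a worked illustration of equation (1), the sketch below computes the EMR for every cell of a table of couple counts. The function name `excessive_mating_ratio` and the 2 × 2 toy table are hypothetical; the only assumption carried over from the text is that the groups are equal-sized quantile groups, so that the random-mating probability of each cell is $(1/n) \times (1/n)$ (with $n = 100$ in the paper).

```python
def excessive_mating_ratio(counts):
    """EMR for each cell of a square table of couple counts.

    Assumes equal-sized quantile groups, so that the random-mating
    probability of any cell is (1/n) * (1/n).
    """
    n = len(counts)
    total = sum(sum(row) for row in counts)
    random_prob = (1.0 / n) * (1.0 / n)
    return [[(c / total) / random_prob for c in row] for row in counts]


# Toy 2x2 table (made up): 80 of 100 couples lie on the main diagonal.
counts = [[40, 10],
          [10, 40]]
emr = excessive_mating_ratio(counts)
# emr[0][0] = (40/100) / (1/4) = 1.6, above 1: positive assortative mating
# emr[0][1] = (10/100) / (1/4) = 0.4, below 1: negative assortative mating
```

With this normalisation the EMR averages to one across all cells, so values above one in some cells are necessarily offset by values below one elsewhere.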

A preliminary descriptive analysis of the level of activity and levels of earnings of newly-weds, discussed in the previous section, led us to take some decisions regarding data selection for couples with both spouses in the 25-44 age group. First, we focus on new couples formed during 2008-2011 and drop from the analysis all records of residents who remained single during the observed period or got married before 2008. Second, as self-employment income is, to a large extent, self-declared and its measurement error is likely to be large, we use employment income only, which is third-party reported with virtually no under-reporting concerns. Similarly, capital income is excluded from the analysis because financial capital income taxation is often fully paid at source and is therefore not declared in tax forms, and real estate and building property income is measured noisily, as data do not allow us to disentangle imputed income based on cadastral values from actual rental income. All income measures are based on market income, i.e. before all taxes and benefits.

Figure 1 plots on a 3D surface the average of the EMR over the period considered, $\mathrm{EMR}_{kj} = \sum_{t=2008}^{2011} \mathrm{EMR}^t_{kj}/4$, where only the (gross) employment income percentile groups of both spouses above the median are shown. It shows that a marriage between people in very distant percentile groups (e.g. a man with median income and a woman in one of the top percentiles, and vice versa) is relatively unlikely, whereas positive assortative mating based on labour income is relatively more frequent. What appears as a striking pattern is the increase of positive assortative mating as the level of income increases above the 90th percentile. In particular, couples with both spouses in the top 1% of each gender-specific employment income distribution are about 25 times more frequent than couples in which both spouses have a median income.[8] This pattern is a constant feature over time: Figure 2 plots the annual unconditional EMR, showing the persistence of positive assortative mating at top income levels for $t = 2008, \ldots, 2011$.[9]

Using Bayes’ rule, we can decompose the EMR in equation (1) as the product of the probability of a woman who got married in year $t$ to a man in percentile group $j$ belonging to percentile group $k$ times the probability of a newly wed man belonging to percentile group $j$, normalised by the random mating probability, as follows:

$$
\mathrm{EMR}^t_{kj} := \frac{\mathrm{Prob}(w = k \mid h = j, \text{married in year } t)}{\mathrm{Prob}(w = k)} \times \frac{\mathrm{Prob}(h = j \mid \text{married in year } t)}{\mathrm{Prob}(h = j)} \tag{2}
$$

[8] Interestingly, there are some gender differences in income assortative mating as the share of top 1% newly wed males who get married to a top 10% woman is equal to 30%, whereas the share of top 1% newly wed females who get married to a top 10% man is equal to 53%.

[9] Figures 1 and 2 show results for couples in which both spouses have at least a median income for clarity reasons, although analogous figures for all percentile groups can be found in the Appendix, Subsection 8.1.

Figure 1: Unconditional joint income distribution by percentile group of each spouse, employment income only, age group 25-44, both spouses above the median. Mean 2008-2011.
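The decomposition in equation (2) can be checked numerically on a small table of counts. The sketch below is illustrative only: the function names (`emr_cell`, `emr_factors`) and the 2 × 2 counts are hypothetical, indices are zero-based, and it assumes equal-sized quantile groups and at least one couple in the husband's column.

```python
def emr_cell(counts, k, j):
    """EMR of cell (k, j) computed directly, as in equation (1)."""
    n = len(counts)
    total = sum(sum(row) for row in counts)
    return (counts[k][j] / total) / ((1.0 / n) * (1.0 / n))


def emr_factors(counts, k, j):
    """The two factors of equation (2) for cell (k, j).

    Rows index wives' groups and columns index husbands' groups.
    """
    n = len(counts)
    total = sum(sum(row) for row in counts)
    col_total = sum(counts[r][j] for r in range(n))  # couples with h = j
    p_w_given_h = counts[k][j] / col_total           # Prob(w=k | h=j, married)
    p_h_married = col_total / total                  # Prob(h=j | married)
    p_random = 1.0 / n                               # Prob(w=k) = Prob(h=j)
    return p_w_given_h / p_random, p_h_married / p_random


# Toy counts (made up): husbands' groups are unevenly represented.
counts = [[30, 10],
          [30, 30]]
first, second = emr_factors(counts, 1, 1)
# first  = (30/40) / (1/2) = 1.5
# second = (40/100) / (1/2) = 0.8
# first * second = 1.2 = emr_cell(counts, 1, 1)
```

The second factor does not depend on the wife's group, which is why the conditional probability in the first factor is the object of interest when studying sorting.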

Figure 3 plots the unconditional distribution of a newly wed man who got married in 2011 belonging to percentile $j$, i.e. $\mathrm{Prob}(h = j \mid \text{married in year } t)$ for all percentile groups, and contrasts it with the analogous distribution for newly wed women. A value equal to 1 in a particular percentile group means that the proportion of newly-weds in that percentile group is as large as random, whereas a value lower than 1 means that it is relatively unlikely for newly-weds to be in that income group, and vice versa for values above one. Focusing on people with positive income,[10] it shows that the likelihood that newly wed women have an income in the 30th percentile of the gender-specific income distribution is about 50% lower than for women in the 60th percentile, and the likelihood that newly wed men have incomes in about the 30th percentile is 75% lower than for those in the 70th percentile. It also shows that the probability increase is approximately linear for both genders up to the 80th percentile for women and the 70th percentile for men; at this point they diverge, increasing for men and decreasing for women. Male newly-weds are in the top 1% of income 2.4 times more often than random, whereas female newly-weds are there only as often as random. Male newly-weds have employment income below the 20th percentile more often than random, although one should recall that this group includes newly-weds who are only self-employed, as well as the group of newly-weds who are temporarily out of employment, possibly because of temporary unemployment or education.

[10] About 20% of newly wed men and 25% of newly wed women have zero income, though they both have a higher likelihood to get married than newly-weds with positive but bottom income, possibly because their potential income is larger.

Figure 2: Unconditional joint income distribution by percentile group of each spouse, employment income only, age group 25-44, both spouses above the median, 2008-2011. Panels: Year 2008; Year 2009; Year 2010; Average EMR over 2008-2011.

The strikingly low probability of getting married for people with positive but low levels of income is consistent with the low probability of marriage for people with low levels of education, which has also been found in other countries (e.g. Schwartz and Mare, 2005).

The reason for the diverging patterns of representation of different employment income percentile groups among male and female newly-weds can be attributed to both demand for and supply of women on the marriage market. Bertrand, Kamenica, and Pan (2015), using US data, recently argued that gender identity norms may explain the aversion to situations where the wife earns more than her husband and that this affects the likelihood of getting married, the division of home production, marriage satisfaction and divorce rates. In a field experiment, Bursztyn, Fujiwara, and Pallais (2017) show that single female students are less likely to report their ambitions, such as desired salaries and willingness to travel and work long hours, when observed by their potential partners. On one hand, high-earning women might be less attractive on the marriage market, as they could threaten men’s identities and signal that they are less inclined to compromise their career in favour of their husband’s. On the other hand, high-income women might intentionally choose not to get married, as gender identity norms could make them less likely to continue working, unsatisfied with family life and likely to be working more at home.

Figure 3: Frequency of newly wed spouses in 2011 by percentile group of employment income and by gender three years before the wedding. Age group 25-44.

[Figure 3 plot. Horizontal axis: gender-specific centile group. Series: Women, Prob(w = k | newlywed); Men, Prob(h = j | newlywed).]

4 Empirical modelling of assortative mating based on income

In this section, we provide an estimate of the conditional probability of a woman belonging to percentile group $k$ given that she got married at time $t$ to a man belonging to percentile group $j$, i.e. the numerator of the first factor of (2).

The aim of this empirical investigation is to assess whether the strong association between wives’ and husbands’ incomes found in the top percentile remains, even when controlling for possible confounding factors and dealing with the co-determination (simultaneity) of each newly-wed’s income at the time of the wedding.

The analysis starts by computing the percentiles of the income distribution in year $t$ of men who were eligible for marriage (i.e. unmarried in year $t-1$): the $q$-th percentile $q^{h}$ satisfies $\Pr[y^{h} < q^{h}] \le q/100$ and, similarly for women, $q^{w}$, where $q = 1, \dots, 100$. We denote as $\{j\}_h$ the percentile group of husbands whose incomes lie between the $(j-1)$-th and the $j$-th percentile, and as $I^{h}_{i,j}$ the binary indicator that is equal to one if the husband in couple $i$ belongs to group $\{j\}_h$ (i.e. $q^{h}_{j-1} < y^{h} < q^{h}_{j}$, where $q^{h}_{j}$ is the $j$-th percentile) and to zero otherwise. Hence, using a similar notation for wives, we provide an estimate of $\mathrm{Prob}(w = k \mid h = j, \text{married in year } t)$ for the population of $N$ newly wed couples in year $t$:

$$I^{w}_{i,k;t} = \sum_{j} \beta_{j,k}\, I^{h}_{i,j;t} + u_{i,k;t} \qquad (3)$$

where $i = 1, \dots, N$, $k = 1, \dots, 100$, $I^{h}$ is the vector indicating to which quantile group the wife's husband belongs, i.e. $I^{h\prime} = [I^{h}_{1}, \dots, I^{h}_{100}]$, $\beta_{k}$ is the vector of coefficients to be estimated and $u_{i,k}$ is the unobserved error term.
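The construction of percentile groups and indicators described above can be sketched in a few lines. This is only an illustration under stated assumptions: the nearest-rank rule for the cut-offs, the convention that an income equal to a cut-off falls in the lower group, the helper names and the simulated log-normal incomes are all hypothetical and are not taken from the paper.

```python
import bisect
import random


def percentile_cutoffs(reference_incomes):
    """Return the 99 cut-offs q_1..q_99 of a reference income distribution.

    Assumption: nearest-rank rule (the paper does not state which rule it uses).
    """
    ordered = sorted(reference_incomes)
    n = len(ordered)
    # Integer ceiling of q * n / 100 gives the nearest-rank position (1-based).
    return [ordered[max(0, (q * n + 99) // 100 - 1)] for q in range(1, 100)]


def percentile_group(income, cutoffs):
    """Group j in 1..100 such that q_{j-1} < income <= q_j."""
    return bisect.bisect_left(cutoffs, income) + 1


def indicator(group, j):
    """Binary indicator: 1 if the person belongs to percentile group j."""
    return 1 if group == j else 0


# Simulated incomes of men eligible for marriage (purely illustrative data).
random.seed(0)
eligible_men = [random.lognormvariate(10, 0.8) for _ in range(10_000)]
cutoffs = percentile_cutoffs(eligible_men)
groups = [percentile_group(y, cutoffs) for y in eligible_men]
```

The same functions would be applied separately to women, so that each spouse is ranked within his or her own gender-specific distribution.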

Model (3) is a linear probability model, which we estimate by ordinary least squares (OLS) for all wives' quantile groups, $k = 1, \dots, 100$, although our attention will focus on top quantiles. In fact, the estimation of $\{\beta_k\}$ in (3) is likely to be affected by simultaneity bias, which would lead to an overestimation of the true parameter. This is because the decision to get married is co-determined between spouses, and the husband's income ranking in the year of the wedding, $I^{h}_{i,j;t}$, is likely to correlate with the unobserved characteristics of the wife, which are included in $u_{i,k;t}$. One can think of a wedding as being decided $\ell$ years before $t$ or even earlier and, since then, the fiancés might decide to change their behaviour (e.g. labour market participation) in preparation for the wedding. For instance, as a wedding is often a preparation for having children, the future spouses might exert more effort in advance, working harder before the wedding to get a promotion and enjoy better prospects during maternity leave, or they might share each other's network of contacts, improving each other's chances of getting a better job. Therefore, to limit the effect of the simultaneity bias, we exploited as much as possible the panel dimension of our dataset, estimating the assortative mating coefficient by looking at each newly-wed's position $\ell$ years before marriage, where very large values of $\ell$ would make $I^{s}$, for $s = w, h$, largely correlated with the family of origin background and less correlated with the future spouse's own income (possibly not yet known at time $t - \ell$).

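The corresponding lagged specification is:

$$I^{w}_{i,k;t-\ell} = \sum_{j} \beta_{j,k}\, I^{h}_{i,j;t-\ell} + u_{i,k;t-\ell}. \qquad (4)$$

If the simultaneity bias were zero, the estimated $\beta$ coefficients in (3) and (4) would be the same.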
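To make the linear probability model concrete, the sketch below takes equations (3) and (4) literally: one dummy per husband percentile group and no intercept. Under that assumption the OLS coefficient for husband group $j$ is simply the share of wives in group $k$ among couples whose husband is in group $j$, so no matrix algebra is needed. The function name and the six toy couples are hypothetical, and the paper's tables may be parametrised differently (for example relative to a reference group).

```python
from collections import Counter


def lpm_coefficients(wife_groups, husband_groups, k):
    """OLS coefficients beta_{j,k} for one wife group k.

    Assumption: a full set of mutually exclusive husband-group dummies and no
    intercept, in which case OLS reduces to the within-group mean of the
    wife's indicator.
    """
    couples_per_group = Counter(husband_groups)
    wife_in_k = Counter(
        h for w, h in zip(wife_groups, husband_groups) if w == k
    )
    return {j: wife_in_k[j] / couples_per_group[j] for j in sorted(couples_per_group)}


# Six toy couples (illustrative only): percentile groups of wives and husbands.
wives = [100, 100, 50, 50, 100, 50]
husbands = [100, 100, 100, 50, 50, 50]
beta_top = lpm_coefficients(wives, husbands, k=100)
```

Running the same function on groups measured in year $t$ and in year $t - \ell$ would give the two sets of coefficients whose difference is attributed to simultaneity.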

A visual depiction of the average mobility of wives and husbands for the three years before the wedding is shown in Figure 4, providing illustrative evidence that the anticipation effect in preparation for the wedding is larger between the 40th and 60th percentile groups for both partners, while it is almost null for those in the top levels of their income distributions.

Figure 4: Average mobility of income by gender $\ell$ years before marriage, age group 25-44. Marriage year 2011. Scatter points when $\ell = 3$ and locally weighted regressions for $\ell = 1, 2, 3$.
[Plot not reproduced. Panels: Wives; Husbands. Axes: Year t and Year t - lag. Legend: bisector and one series per lag.]

5 Empirical estimation of assortative mating based on income

In this section, we present the results of estimating the likelihood that a newly wed woman in income percentile group $k$ gets married to a man in income percentile group $j$. However, given the picture of the joint distribution shown in Section 3, our interest mainly focuses on a subset of the coefficient vectors $\beta_k$, $k = 1, \dots, 100$, and results will be presented only for the median group and for the top percentile groups, i.e. for $k = 50, 98, 99, 100$.

In what follows, we consider only members of newly wed couples, although the income percentile group to which the newly wed spouses belong is identified as above, namely based on the population of newly wed and unmarried individuals by gender in the same age group. As we will also provide estimates of assortative mating when each spouse's ranking in the income distribution is observed up to $\ell = 3$ years before the wedding, as in (4), we will focus only on newly wed couples in 2011. Table 4 reports some summary statistics by gender for newly-weds in the 25-44 age group. It shows that, out of the 23,813 women who got married in 2011, 203, 258, 281 and 274 were in the 50th, 98th, 99th and 100th percentile groups, respectively. The average employment income of the top 1% of newly wed wives is more than 6 times the mean income of median newly wed wives. Newly wed husbands are relatively more frequent in the top percentile groups. We found 493, 503 and 568 newly wed husbands in the 98th, 99th and 100th percentile groups, respectively, whereas only 139 newly wed husbands have an income at the median level for this age group. The income of newly wed husbands is higher than that of equally ranked women and the difference is very large in the top 1%, where the right-hand tail of the income distribution is very thick.

The mean employment income of wives in the top percentile group is about €78,000 and that of husbands in the top group is more than twice that.

In the tables that follow, we used the smallest estimation sample for the specifications (3)-(4) to allow comparison of results between the different specifications and understanding of the direction of the simultaneity and omitted variable biases.

Table 5 provides the estimates of the basic specification (3) and of the specification (4) with $\ell = 3$ for each quantile $k = 50, 98, 99, 100$. The first two columns show the probability of a wife being in the 50th percentile group given the husband's percentile group; the estimated coefficients are not significantly different from zero.

The following columns suggest that the likelihood of a newly married wife being in the top percentile group is highly correlated with the probability of her husband being in the top percentile group. In particular, the next-to-last column shows that the probability of a newly married wife being in the top percentile group is about 10 percentage points higher (a coefficient of 0.097) if she gets married to a top-1% husband, according to specification (3). By dealing with the simultaneity bias, as in (4), the estimate for the top percentile groups is reduced to 7 percentage points (0.070), with a high degree of statistical significance.

Table 6, focusing on wives in the top 1% only, allows an assessment of the size of the simultaneity bias and the effect of including some controls, such as age, number of children and municipality of residence of each spouse, the latter consisting of about 1,500 fixed effects. Comparing results derived using no controls and with increasing lags, namely with $\ell = 0, 1, 2, 3$, one can observe that the $\beta_{100,100}$ coefficient is reduced from 0.097 for $\ell = 0$ to 0.070 for $\ell = 3$, suggesting that the simultaneity bias acts in the expected direction but also that it is not large enough at the top to explain the high level of assortative mating observed in the unconditional joint distribution. Control variables generally increase the explanatory power of the model and slightly reduce the magnitude of the assortative mating coefficient, although maintaining its high statistical significance. The coefficient $\beta_{100,100}$ estimated in the last column of Table 6 suggests that a woman who got married in 2011 to a man who was in the top 1% three years earlier, at a time when they probably had not yet committed to marriage, was about six times more likely to belong to the top 1% of her distribution than the median woman. One should also remember that this estimation accounts for the first part of (2) and, to obtain the unconditional EMR, one should multiply it by about 2.5, which is the probability of a married man belonging to the top 1%.


5.1 Robustness checks

Results are robust to different definitions of income. Table 7 has the same structure as Table 5, the only difference being that income is now defined as the sum of self-employment income plus wages and salaries. Considering self-employment income in addition to employment income allows us to take into account total labour income at the expense of increasing measurement error, as self-employment income is largely affected by tax-avoiding behaviours. As expected, coefficients are lower, as measurement error causes an estimation bias towards zero, although they remain highly significant at the top levels of income.

Table 8 shows the robustness of results to different age selections. The age groups considered are couples in which both spouses are five years younger than in the main regressions (i.e. 20-39), older (i.e. 30-49) or in a larger age interval (i.e. 20-59). Focusing on the estimated coefficient ($\hat{\beta}_{100,100}$), one can observe that the probability of a top-percentile wife marrying a top-percentile husband remains highly significant, and pointwise coefficient estimates tend to increase when older couples are included, consistent with our a priori expectations.

Overall, these results suggest that income is a significant factor in partner selection, although evidence for this is found only at the very top levels of income.

Table 4: Some statistics by a selected group of employment income percentiles. All individuals are aged 25-44. 2011 only.

Employment income

Wives
Percentile group        50        98        99       100
N. Obs.                203       258       281       274
Mean                11,931    42,260    50,564    77,851
Minimum             11,700    39,900    45,500    57,000
Maximum             12,100    45,500    56,800   omitted

Husbands
Percentile group        50        98        99       100
N. Obs.                139       493       503       568
Mean                16,327    52,154    62,792   163,227
Minimum             16,151    48,880    56,280    72,149
Maximum             16,479    56,252    71,988   omitted

Notes: All monetary figures are in current euros. Minimum and maximum values have been rounded to the nearest hundred. The whole sample includes the 23,813 women and 23,813 men who got married in 2011.

Table 5: OLS estimates of the probability of a woman belonging to percentile $I^{w}_{k}$ getting married to a man belonging to percentile $I^{h}_{j}$, for some selected $k$ and $j$, in the year of the wedding ($t$, i.e. $\ell = 0$) and at three years before the wedding ($\ell = 3$), where $t$ is 2011.

Dependent variable:     $I^{w}_{50;t-\ell}$      $I^{w}_{98;t-\ell}$      $I^{w}_{99;t-\ell}$      $I^{w}_{100;t-\ell}$
                        ℓ = 0      ℓ = 3      ℓ = 0      ℓ = 3      ℓ = 0      ℓ = 3      ℓ = 0      ℓ = 3
...
$I^{h}_{50;t-\ell}$     -0.010     -0.010     -0.011     -0.014     -0.017     -0.017     -0.013     -0.011
                        (0.014)    (0.011)    (0.017)    (0.014)    (0.018)    (0.014)    (0.016)    (0.013)
...
$I^{h}_{98;t-\ell}$     -0.003     -0.006     0.037***   0.015*     0.031***   0.037***   0.014*     0.008
                        (0.006)    (0.007)    (0.008)    (0.008)    (0.008)    (0.008)    (0.007)    (0.008)
$I^{h}_{99;t-\ell}$     -0.010     0.001      0.034***   0.029***   0.039***   0.034***   0.046***   0.036***
                        (0.006)    (0.006)    (0.008)    (0.008)    (0.008)    (0.007)    (0.007)    (0.007)
$I^{h}_{100;t-\ell}$    0.001      0.006      0.044***   0.051***   0.064***   0.059***   0.097***   0.070***
                        (0.007)    (0.007)    (0.008)    (0.009)    (0.008)    (0.009)    (0.008)    (0.008)
Fixed-effects:
Municipality            No         No         No         No         No         No         No         No
N. children             No         No         No         No         No         No         No         No
Age                     No         No         No         No         No         No         No         No
Observations            11,062     11,062     11,062     11,062     11,062     11,062     11,062     11,062
R-squared               0.006      0.006      0.012      0.012      0.020      0.018      0.029      0.019

Notes: Results are presented for a selection of $I^{h}_{j}$, namely for the median and the top three percentile groups. Estimates for different values of $j$ can be obtained upon request from the authors. Only couples in which both spouses were resident in Lombardy for three years before the wedding are considered. Standard errors in parentheses; *** p<0.01, ** p<0.05, * p<0.1.

Table 6: OLS estimates of the probability of a woman belonging to the top 1% $I^{w}_{100}$ getting married to a man belonging to percentile $I^{h}_{j}$, for some selected $j$, in the year of the wedding ($t$, i.e. $\ell = 0$) and at three years before the wedding ($\ell = 3$), where $t$ is 2011.

Dependent variable: $I^{w}_{100;t-\ell}$ (all columns)
                        ℓ = 0      ℓ = 0      ℓ = 1      ℓ = 1      ℓ = 2      ℓ = 2      ℓ = 3      ℓ = 3
...
$I^{h}_{50;t-\ell}$     -0.013     -0.000     -0.003     0.003      -0.010     -0.007     -0.011     -0.011
                        (0.016)    (0.018)    (0.013)    (0.014)    (0.012)    (0.015)    (0.013)    (0.016)
...
$I^{h}_{98;t-\ell}$     0.014*     0.010      0.022***   0.020***   0.010      0.005      0.008      0.004
                        (0.007)    (0.008)    (0.007)    (0.008)    (0.007)    (0.008)    (0.008)    (0.009)
$I^{h}_{99;t-\ell}$     0.046***   0.051***   0.035***   0.030***   0.041***   0.030***   0.036***   0.036***
                        (0.007)    (0.008)    (0.007)    (0.008)    (0.007)    (0.008)    (0.007)    (0.008)
$I^{h}_{100;t-\ell}$    0.097***   0.090***   0.082***   0.066***   0.094***   0.086***   0.070***   0.057***
                        (0.008)    (0.008)    (0.008)    (0.008)    (0.008)    (0.009)    (0.008)    (0.009)
Fixed-effects:
Municipality            No         Yes        No         Yes        No         Yes        No         Yes
N. children             No         Yes        No         Yes        No         Yes        No         Yes
Age                     No         Yes        No         Yes        No         Yes        No         Yes
Observations            11,062     11,062     11,062     11,062     11,062     11,062     11,062     11,062
R-squared               0.029      0.200      0.025      0.197      0.025      0.165      0.019      0.177

Notes: Results are presented for a selection of $I^{w}_{k}$ and $I^{h}_{j}$, namely for the median and the top three percentile groups. Estimates for different values of $k$ and $j$ can be obtained upon request from the authors. Standard errors in parentheses; *** p<0.01, ** p<0.05, * p<0.1.

Table 7: OLS estimates of the probability of a woman belonging to percentile group $I^{w}_{k}$ getting married to a man belonging to percentile group $I^{h}_{j}$, for some selected $k$ and $j$, in the year of the wedding ($t$, i.e. $\ell = 0$) and at three years before the wedding ($\ell = 3$), where $t$ is 2011. Income from both employment and self-employment (labour income).

Dependent variable:     $I^{w}_{50;t-\ell}$      $I^{w}_{98;t-\ell}$      $I^{w}_{99;t-\ell}$      $I^{w}_{100;t-\ell}$
                        ℓ = 0      ℓ = 3      ℓ = 0      ℓ = 3      ℓ = 0      ℓ = 3      ℓ = 0      ℓ = 3
...
$I^{h}_{50;t-\ell}$     -0.002     -0.004     0.001      -0.023     0.004      -0.015     -0.017     -0.021
                        (0.011)    (0.013)    (0.013)    (0.015)    (0.014)    (0.015)    (0.012)    (0.013)
...
$I^{h}_{98;t-\ell}$     -0.001     -0.002     0.036***   0.005      0.028***   0.049***   0.013*     0.010
                        (0.007)    (0.009)    (0.008)    (0.010)    (0.009)    (0.010)    (0.008)    (0.009)
$I^{h}_{99;t-\ell}$     -0.012*    -0.005     0.049***   0.041***   0.042***   0.056***   0.042***   0.049***
                        (0.007)    (0.009)    (0.008)    (0.010)    (0.009)    (0.010)    (0.008)    (0.009)
$I^{h}_{100;t-\ell}$    -0.008     -0.001     0.029***   0.048***   0.076***   0.052***   0.068***   0.025***
                        (0.007)    (0.009)    (0.008)    (0.011)    (0.009)    (0.010)    (0.008)    (0.009)
Controls:
Municipality            No         No         No         No         No         No         No         No
N. children             No         No         No         No         No         No         No         No
Age                     No         No         No         No         No         No         No         No
Observations            11,062     11,062     11,062     11,062     11,062     11,062     11,062     11,062
R-squared               0.011      0.259      0.014      0.166      0.025      0.183      0.024      0.165

Notes: Results are presented for a selection of $I^{w}_{k}$ and $I^{h}_{j}$, namely for the median and the top three percentile groups. Estimates for different values of $k$ and $j$ can be obtained upon request from the authors. Only couples in which both spouses were resident in Lombardy for three years before the wedding are considered. Standard errors in parentheses; *** p<0.01, ** p<0.05, * p<0.1.

Table 8: OLS estimates of the probability of a woman belonging to percentile group $I^{w}_{100}$ getting married to a man belonging to percentile $I^{h}_{j}$, for some selected $k$ and $j$, at time $t - 3$, where $t$ is 2011, controlling for only wives' and for both spouses' observable characteristics and both spouses' individual fixed effects. Robustness checks by different age groups.

                        Age 20-39               Age 30-49               Age 20-59
Dependent variable: $I^{w}_{100;t-\ell}$ (all columns)
                        ℓ = 0      ℓ = 3      ℓ = 0      ℓ = 3      ℓ = 0      ℓ = 3
...
$I^{w}_{50;t-3}$        -0.013     -0.006     -0.007     -0.056*    -0.001     0.035**
                        (0.023)    (0.014)    (0.030)    (0.030)    (0.013)    (0.015)
...
$I^{w}_{98;t-3}$        0.039***   0.010      0.005      0.005      0.006      0.019***
                        (0.008)    (0.008)    (0.013)    (0.012)    (0.006)    (0.007)
$I^{w}_{99;t-3}$        0.030***   0.035***   0.045***   0.042***   0.042***   0.036***
                        (0.008)    (0.008)    (0.013)    (0.012)    (0.006)    (0.006)
$I^{w}_{100;t-3}$       0.105***   0.078***   0.085***   0.052***   0.092***   0.092***
                        (0.009)    (0.008)    (0.012)    (0.013)    (0.007)    (0.007)
Controls:
Municipality            Yes        Yes        Yes        Yes        Yes        Yes
N. children             Yes        Yes        Yes        Yes        Yes        Yes
Age                     Yes        Yes        Yes        Yes        Yes        Yes
Observations            11,687     11,687     6,086      6,086      16,344     16,344
R-squared               0.189      0.161      0.215      0.213      0.148      0.135

Notes: Results are presented for a selection of $I^{h}_{j}$, namely for the median and the top three percentile groups. Estimates for different values of $j$ can be obtained upon request from the authors. Only couples in which both spouses were resident in Lombardy for three years before the wedding are considered. Standard errors in parentheses; *** p<0.01, ** p<0.05, * p<0.1.

6 Effects of assortative mating on inequality

Finally, we address the question of whether or not earnings assortative mating has an effect on income inequality. Greenwood, Guner, Kocharkov, and Santos (2014), Eika, Mogstad, and Zafar (2014) and Hryshko, Juhn, and McCue (2017b) showed that educational assortative mating increases household income inequality. Here we assess by how much income assortative mating affects household income inequality measures and income shares, which are of particular interest given the highly positive assortative mating at top income levels.

We focus on new couples in 2011 where both spouses are aged between 25 and 44 and at least one of them earns employment income, although the results are broadly consistent across different age groups and with those that include self-employment income. We focus on newly-weds because they represent a very small proportion of the total population (about 0.6% in each year), so that even large effects of assortative mating in one year would have a negligible effect on the concentration of income in the overall distribution. The measures are computed on individual incomes, assuming income is shared equally within each couple. Actual income inequality measures and shares are compared with counterfactual measures obtained by randomly generating couples in which each husband is matched with a randomly chosen wife, assuming that labour supply and income are exogenous to household formation. This random allocation of husbands to wives is repeated 1,000 times, generating a counterfactual distribution of randomly allocated couples.
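The random-pairing counterfactual can be sketched as follows. This is a minimal illustration, assuming equal sharing within the couple and a standard rank-based formula for the Gini index; the function names, the seed and the four toy couples are hypothetical and do not reproduce the paper's data or code.

```python
import random


def gini(values):
    """Gini index from the rank-based formula on ascending-sorted values."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    weighted = sum((rank + 1) * x for rank, x in enumerate(xs))
    return 2 * weighted / (n * total) - (n + 1) / n


def equal_share_incomes(wives, husbands):
    """Individual incomes when each couple splits its total income equally."""
    incomes = []
    for w, h in zip(wives, husbands):
        incomes.extend([(w + h) / 2] * 2)
    return incomes


def random_mating_gini(wives, husbands, replications=1000, seed=0):
    """Gini indices obtained by re-pairing husbands with wives at random."""
    rng = random.Random(seed)
    shuffled = list(husbands)
    draws = []
    for _ in range(replications):
        rng.shuffle(shuffled)  # each wife gets a randomly chosen husband
        draws.append(gini(equal_share_incomes(wives, shuffled)))
    return draws


# Toy example with perfectly sorted couples (illustrative incomes only).
wives = [10.0, 20.0, 30.0, 40.0]
husbands = [10.0, 20.0, 30.0, 40.0]
actual = gini(equal_share_incomes(wives, husbands))
draws = random_mating_gini(wives, husbands, replications=200)
```

In this toy case the actual pairing is perfectly assortative, so no random re-pairing can produce a higher Gini index than the actual one; with real data the comparison of interest is between the actual value and the mean of the counterfactual draws.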

Figure 5 shows results in which random mating income inequality indices and share distributions are contrasted with actual figures, which are represented by a vertical line. The distributions of random mating inequality indices are roughly symmetric and highly concentrated around the mean, showing that, under random mating, inequality would be about 7% lower (a mean Gini index of 0.322 against an actual value of 0.345; see Table 9).

The size of the assortative mating effect on income inequality seems to be in line with, for example, that found by Eika, Mogstad, and Zafar (2014) and well below 10%. In particular, administrative data show that the reason why random mating does not change the top income share is that a top-earning spouse (often the husband in Italy) is sufficient to place the household in the top income percentile group. Our data allow us to go deeper with the counterfactual analysis and to show that the picture changes if one focuses on household (employment) income when:

(a) both spouses belong to the top 1% of their gender income distributions;

(b) both spouses belong to the next 4% of their gender income distributions (i.e. they both have incomes between the 95th and 99th percentile);

(c) both spouses belong to the next 5% of their gender income distributions (i.e. they both have incomes between the 90th and 95th percentile);

(d) both spouses belong to the next 40% of their gender income distributions (i.e. they both have incomes above the median and below the 90th percentile).

To facilitate interpretation, we normalised each income share by dividing it by the random probability of falling into each group, i.e. (1/100 × 1/100) for the top 1%, (5/100 × 5/100 to 1/100 × 1/100) for the next 4%, (10/100 × 10/100 to 5/100 × 5/100) for the next 5%, and (50/100 × 50/100 to 10/100 × 10/100) for the next 40%.
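For the top 1% group, the normalisation can be illustrated with a short sketch. It assumes that the "income share" of a group is the share of total couple income held by couples in which both spouses exceed their gender-specific top cut-off; that reading, the function name and the toy numbers are assumptions made for illustration, not the paper's definition or data.

```python
def normalised_top_share(couples, wife_top_cut, husband_top_cut):
    """Share of total couple income held by couples with both spouses above
    their gender-specific cut-off, divided by (1/100 x 1/100)."""
    total = sum(w + h for w, h in couples)
    top = sum(
        w + h
        for w, h in couples
        if w > wife_top_cut and h > husband_top_cut
    )
    share = top / total
    # Probability that both spouses fall in their top 1% under independence.
    return share / ((1 / 100) * (1 / 100))


# Toy data: three low-income couples and one couple above both cut-offs.
couples = [(1.0, 1.0), (1.0, 1.0), (1.0, 1.0), (3.0, 3.0)]
value = normalised_top_share(couples, wife_top_cut=2.0, husband_top_cut=2.0)
```

A value above one means that such couples hold a larger share of income than their expected frequency under independent pairing would suggest.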

Figure 5: Gini inequality index assuming income is shared equally in a couple. Actual value (blue vertical line) and histogram distribution of randomly allocated spouses (histograms, with 1,000 different random allocations of spouses, and mean value shown by a red line). Year 2011, both spouses aged 25-44.
[Plot not reproduced. Horizontal axis: Gini. Legend: counterfactual distribution, counterfactual mean, actual.]

Figure 6 shows the results of plotting the distribution of 1,000 different random mating simulations and the actual value, depicted as a vertical line. Table 9 complements the figures by showing the pointwise estimates. Panel (a) shows that the actual employment income share of couples in which both spouses are in the top 1% of their respective income distributions is 61.4 times larger than would be expected with a random distribution of income. Had marriage been independent of income, the average share would have been much lower, around 11.3, i.e. over 80% lower, although with a large variance. Panel (b) shows that, for couples with both spouses in the next 4%, the actual employment income share is 16.2 times larger than would be expected with a random distribution of income. The average under random mating would be about one third of this.
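As a check on the orders of magnitude quoted above, the relative differences can be recomputed directly from the point estimates reported in Table 9 (actual versus random-mating mean); rounding is to three decimals:

\[
1 - \frac{11.285}{61.418} \approx 0.816,
\qquad
\frac{6.128}{16.235} \approx 0.377 .
\]

The first figure corresponds to the reduction of more than 80% for couples with both spouses in the top 1%, and the second to the random-mating mean being roughly one third of the actual value for the next 4%.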

Figure 6: Actual values (blue vertical line) and histogram distribution of randomly allocated spouses (histograms, with 1,000 different random allocations of spouses, and mean value shown by a red line), normalised by the total probability under uniform distribution. Year 2011, both spouses aged 25-44, 1,000 random allocations of spouses.
[Plots not reproduced. Panels: (a) Top 1% couples (both spouses in top 1%); (b) Next 4% couples (both spouses in 95th-99th percentile); (c) Next 5% couples (both spouses in 90th-95th percentile); (d) Next 40% couples (both spouses in 50th-90th percentile). Legend: counterfactual distribution, counterfactual mean, actual.]

Table 9: Income shares with respect to random distribution, with actual and randomly allocated couples in 1,000 random matching for some selected income groups. Both spouses aged 25-44.

(a) Inequality
                                          Actual                 Random Mating
                                          Assortative Mating     mean        st. dev.
Gini                                      0.345                  0.322       0.001

(b) Top income shares
                                          Actual                 Random Mating
                                          Assortative Mating     mean        st. dev.
Both spouses in top 1%                    61.418                 11.285      15.743
Both spouses in 95th-99th percentile      16.235                 6.128       1.491
Both spouses in 90th-95th percentile      5.456                  3.377       0.479
Both spouses in 50th-90th percentile      1.655                  1.828       0.033

Notes: Statistics computed based on the residents who got married in 2011: 63,734 individuals (31,867 couples). Top-1% couples are those in which both spouses belong to the top 1% of their respective gender income distribution; in the next 4% of couples, both spouses have incomes between the 95th and 99th percentiles; in the next 5% of couples, both spouses have incomes between the 90th and 95th percentiles and, in the next 40% of couples, both spouses have incomes above the median and below the 90th percentile. To facilitate interpretation, we divided each share by the probability of falling into each group under a uniform distribution, i.e. (1/100 × 1/100) for the top 1%, (5/100 × 5/100 to 1/100 × 1/100) for the next 4%, (10/100 × 10/100 to 5/100 × 5/100) for the next 5%, and (50/100 × 50/100 to 10/100 × 10/100) for the next 40%. The random mating distribution is obtained by matching each woman to a randomly chosen man and replicating the process 1,000 times, hence computing mean and standard deviation.

7 Concluding comments

In this paper, we exploited a novel dataset to assess income assortative mating

among newly-weds, which has seldom been analysed in the literature mostly

because datasets recording individual incomes typically do not measure the in-

comes of both spouses before marriage and therefore do not allow one to gauge

the extent of endogeneity caused by simultaneity. Thanks to Italian legislation

allowing us to identify new couples using tax records and taking advantage of

the longitudinal dimension of the data, which allows us to account for the si-

multaneity of the spouses’ incomes, we found that positive income assortative

mating is large and stable, especially at the top of the income distribution,

whereas there is no sign of significantly positive assortative mating for most of

the rest of the distribution. Our analysis cannot claim to determine causality,

although we controlled for a detailed set of municipality variables, as well as

other observable characteristics of both spouses, and assessed assortative mat-

ing at three years before the wedding, when the spouses most likely were not

promised to each other in marriage and, in some cases, had not even met each

other. Our results show that assortative mating is large at the top levels of in-

come, even when controlling for employment income, which is well known to be

relatively more equally distributed than capital income and wealth, for which,

unfortunately, we have no data. A similar approach can be used for countries

where administrative data and suitable institutional settings are available.

There might be several reasons why this happens. It is not surprising that most people want to marry someone whose background, lifestyle, cultural and religious views and income are similar to their own. Assortative mating can be a cause for celebration, particularly if shared backgrounds and interests result in stronger and more stable relationships. However, our analysis of the effect of income assortative mating on inequality suggests that it affects the distribution of resources across the population and therefore might be a cause for concern. For instance, it might exacerbate the problem of low social mobility and the intergenerational transmission of adverse conditions. Regarding family formation, less well-off families are likely to provide lower levels of education to their offspring, and less educated people are much more likely to have a child before marrying. Indeed, recent literature (e.g. Chetty, Hendren, Kline, and Saez, 2014) shows that a child born in the bottom income quintile group has a higher chance of remaining in that quintile as an adult, and parents in the bottom income quintile group are more likely than middle-income parents to score among the weakest parents.


8 Appendix

8.1 The EMR over all percentiles

Figure 7 shows the three-dimensional profile of EMR for all percentile groups.

It shows that couples with no employment income are relatively less likely to

get married than couples with some income. Interestingly, it seems that the fre-

quency of top-income husbands getting married to zero-income women increases

slightly over time.

Figure 7: Unconditional joint income distribution by percentile group of each spouse, employment income only, age group 25-44, 2008-2011.
[Plots not reproduced. Panels: Year 2008, Year 2009, Year 2010, Year 2011.]

8.2 Estimation of the average homogamy coefficient

The sociological literature normally uses the term "homogamy" as synonymous

with assortative mating and describes changes in mating patterns using log-

linear models for contingency tables according to certain characteristics (e.g.

educational attainment or income category) (Agresti, 2002). These models focus

on estimates of the average association between couples’ characteristics, namely

the diagonal in the contingency table, controlling for shifts in the marginal

distributions.

This literature (e.g. Schwartz, 2010) usually estimates a model of average assortative mating, which is fully saturated with the cross interactions between the husband's and wife's characteristics. In the case of assortative mating based on income, the model would be the following:

$$\log(c^{t}_{kj}) = \lambda + \sum_{k} \lambda^{w}_{k} I^{w}_{k;t} + \sum_{j} \lambda^{h}_{j} I^{h}_{j;t} + \sum_{kj} \lambda_{k,j}\, I^{w}_{k;t} \cdot I^{h}_{j;t} + \lambda_{H} I_{H;t} + \varepsilon_{kj;t} \qquad (5)$$

where, consistent with the notation used in Section 4, $c^{t}_{kj}$ is the number of newly wed couples with a husband in percentile group $j$ and a wife in percentile group $k$ in year $t$; $I^{s}_{k;t}$ is a binary indicator that takes a value equal to one if the spouse $s = w, h$ belongs to group $\{k\}_s$, for $k = 1, \dots, 100$; $I_{H;t}$ is a binary indicator, which takes a value equal to one if the newly wed spouses belong to the same percentile group; the $\lambda$s are coefficients to be estimated; and $\varepsilon_{kj;t}$ is an error term. Model (5) is estimated using a Poisson regression (e.g. Cameron and Trivedi, 2013), and the homogamy coefficient is $\lambda_{H}$, which is the average assortative mating along the main diagonal of the contingency table.

By taking into account the endogeneity resulting from simultaneity, as in Section 4, and using the income percentiles of $\ell$ periods earlier, we can estimate the following:

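$$\log(c^{t}_{kj}) = \lambda + \sum_{k} \lambda^{w}_{k} I^{w}_{k;t-\ell} + \sum_{j} \lambda^{h}_{j} I^{h}_{j;t-\ell} + \sum_{kj} \lambda_{k,j}\, I^{w}_{k;t-\ell} \cdot I^{h}_{j;t-\ell} + \lambda_{H} I_{H;t-\ell} + \varepsilon_{kj;t-\ell} \qquad (6)$$

$\lambda_{H}$ is a "homogamy term" added to estimate the average effect of belonging to the same percentile group, which means lying on the diagonal of the contingency table for both spouses.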
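To illustrate what a homogamy term measures, the sketch below fits a simplified version of the log-linear model: row effects for wives, column effects for husbands and a single diagonal multiplier $\theta = \exp(\lambda_H)$. Two simplifications are assumptions of this sketch and not the paper's procedure: the cross interactions $\lambda_{k,j}$ are omitted, and the fit uses iterative proportional scaling, which solves the Poisson likelihood equations for this particular model, rather than a Poisson regression routine. The function name and the 3 × 3 toy table are hypothetical.

```python
def fit_homogamy(table, iterations=2000):
    """Fit mu[k][j] = a_k * b_j * theta**(k == j) to a square table of counts.

    Each step rescales fitted values so that one set of fitted totals matches
    the observed totals (rows, columns, then the diagonal). Returns theta.
    """
    size = len(table)
    mu = [[1.0] * size for _ in range(size)]
    theta = 1.0
    diagonal_observed = sum(table[k][k] for k in range(size))
    for _ in range(iterations):
        # Match the wives' (row) totals.
        for k in range(size):
            factor = sum(table[k]) / sum(mu[k])
            mu[k] = [m * factor for m in mu[k]]
        # Match the husbands' (column) totals.
        for j in range(size):
            factor = sum(row[j] for row in table) / sum(row[j] for row in mu)
            for row in mu:
                row[j] *= factor
        # Match the total number of couples on the main diagonal.
        factor = diagonal_observed / sum(mu[k][k] for k in range(size))
        for k in range(size):
            mu[k][k] *= factor
        theta *= factor  # theta accumulates the diagonal rescalings
    return theta


# Toy table built as a_k * b_j, doubled on the diagonal (so theta is 2).
a = [1.0, 2.0, 3.0]
b = [2.0, 1.0, 1.0]
table = [[a[k] * b[j] * (2.0 if k == j else 1.0) for j in range(3)] for k in range(3)]
theta = fit_homogamy(table)
```

In this constructed example the diagonal cells are twice as large as the row and column effects alone would imply, and the fitted multiplier recovers that factor.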

For computational reasons, instead of using percentile groups, we have used ventile groups, i.e. equally frequent groups with one twentieth of the total population in each group. In Table 10, we show the estimation of the probability of having both newly wed spouses in the same income ventile for models (5) and (6) in the first and second columns, respectively. It shows a significant homogamy probability of lying on the diagonal with both $\ell = 0$ and $\ell = 3$, suggesting that newly-weds are about twice as likely to belong to the same income ventile as to belong to different ones and that the homogamy probability does not change significantly even if simultaneity is accounted for.

Table 10: Odds ratios for models (5) and (6) estimated with Poisson models

                                      $\log(c^t_{kj})$    $\log(c^t_{kj})$
                                      $\ell = 0$          $\ell = 3$
Homogamy coefficient $\lambda_H$      1.929***            1.961***
                                      (0.080)             (0.140)
Observations                          400                 400

Notes: Only the homogamy coefficient $\lambda_H$ is shown. The regressions include but do not show all other coefficients in the models. Standard errors in parentheses; *** p<0.01, ** p<0.05, * p<0.1.


References

Agresti, A. (2002): Categorical Data Analysis. John Wiley and Sons.

Arrondel, L., and N. Frémeaux (2016): “‘For Richer, For Poorer’: Assortative Mating and Savings Preferences,” Economica, 83(331), 518–543.

Atkinson, A. B. (2005): “Top incomes in the UK over the 20th century,” Journal of the Royal Statistical Society: Series A (Statistics in Society), 168(2), 325–343.

Atkinson, A. B., T. Piketty, and E. Saez (2011): “Top Incomes in the Long Run of History,” Journal of Economic Literature, 49(1), 3–71.

Autor, D. H. (2014): “Skills, Education, and the Rise of Earnings Inequality Among the ‘Other 99 Percent’,” Science, 344(6186), 843–851.

Barban, N., E. De Cao, S. Oreffice, and C. Quintana-Domeque (2016): “Assortative Mating on Education: A Genetic Assessment,” Discussion paper, University of Oxford, Department of Economics, Economics Series Working Papers.

Becker, G. (1973): “A Theory of Marriage: Part I,” Journal of Political Economy, 81(4), 813–46.

Bertrand, M., E. Kamenica, and J. Pan (2015): “Gender Identity and Relative Income within Households,” The Quarterly Journal of Economics, 130(2), 571–614.

Bursztyn, L., T. Fujiwara, and A. Pallais (2017): “‘Acting Wife’: Marriage Market Incentives and Labor Market Investments,” American Economic Review, 107(11), 3288–3319.

Cameron, A., and P. Trivedi (2013): Regression Analysis of Count Data. Cambridge University Press.

Chetty, R., N. Hendren, P. Kline, and E. Saez (2014): “Where is the land of Opportunity? The Geography of Intergenerational Mobility in the United States,” The Quarterly Journal of Economics, 129(4), 1553–1623.

Chiappori, P.-A., and B. Salanié (2016): “The Econometrics of Matching Models,” Journal of Economic Literature, 54(3), 832–61.

Cowell, F. (2011): Measuring Inequality. Oxford University Press.

Cowell, F. A., and M. P. Victoria-Feser (1996): “Robustness Properties of Inequality Measures,” Econometrica, 64, 77–101.

Daly, M. C., and R. G. Valletta (2006): “Inequality and Poverty in the United States: The Effects of Rising Wage Dispersion of Men’s Earnings and Changing Family Behaviour,” Economica, 73, 75–98.


Eika, L., M. Mogstad, and B. Zafar (2014): “Educational Assortative Mating and Household Income Inequality,” NBER Working Papers 20271, National Bureau of Economic Research, Inc.

Fiorio, C. V. (2011): “Understanding Italian Inequality Trends,” Oxford Bulletin of Economics and Statistics, 73(2), 255–275.

Frémeaux, N., and A. Lefranc (2017): “Assortative mating and earnings inequality in France,” Discussion Paper 11084, IZA DP.

Goldin, C., and L. F. Katz (2007): “The Race between Education and Technology: The Evolution of U.S. Educational Wage Differentials, 1890 to 2005,” NBER Working Papers 12984, National Bureau of Economic Research, Inc.

Gonalons-Pons, P., and C. R. Schwartz (2017): “Trends in Economic Homogamy: Changes in Assortative Mating or the Division of Labor in Marriage?,” Demography, 54(3), 985–1005.

Greenwood, J., N. Guner, G. Kocharkov, and C. Santos (2014): “Marry Your Like: Assortative Mating and Income Inequality,” American Economic Review, 104(5), 348–53.

Hryshko, D., C. Juhn, and K. McCue (2017a): “Trends in earnings inequality and earnings instability among U.S. couples: How important is assortative matching?,” Labour Economics, 48(Supplement C), 168–182.

Hryshko, D., C. Juhn, and K. McCue (2017b): “Trends in earnings inequality and earnings instability among U.S. couples: How important is assortative matching?,” Labour Economics, 48(Supplement C), 168–182.

Katz, L., and D. Autor (1999): “Changes in the wage structure and earnings inequality,” in Handbook of Labor Economics, ed. by O. Ashenfelter, and D. Card, vol. 3, Part A, chap. 26, pp. 1463–1555. Elsevier, 1 edn.

Larrimore, J. (2014): “Accounting for United States Household Income Inequality Trends: The Changing Importance of Household Structure and Male and Female Labor Earnings Inequality,” Review of Income and Wealth, 60(4), 683–701.

Mare, R. D. (1991): “Five Decades of Educational Assortative Mating,” American Sociological Review, 56(1), 15–32.

Pestel, N. (2017): “Marital Sorting, Inequality and the Role of Female Labour Supply: Evidence from East and West Germany,” Economica, 84(333), 104–127.

Piketty, T. (2001): Les Hauts revenus en France au 20e siècle : inégalités et redistribution, 1901-1998. Grasset.


Piketty, T., and E. Saez (2003): “Income Inequality in the United States, 1913-1998,” Quarterly Journal of Economics, 118(1), 1.

Schwartz, C. (2010): “Earnings Inequality and the Changing Association between Spouses’ Earnings,” American Journal of Sociology, 115(5), 1524–1557.

Schwartz, C. R., and R. D. Mare (2005): “Trends in educational assortative marriage from 1940 to 2003,” Demography, 42(4), 621–646.
