DIPARTIMENTO DI ECONOMIA, MANAGEMENT E METODI QUANTITATIVI
Via Conservatorio 7 20122 Milano
tel. ++39 02 503 21501 (21522) - fax ++39 02 503 21450 (21505) http://www.economia.unimi.it
E Mail: [email protected]
LOOKING IN YOUR PARTNER’S POCKET BEFORE SAYING “YES!” INCOME ASSORTATIVE MATING AND INEQUALITY
CARLO V. FIORIO STEFANO VERZILLO
Working Paper 2/2018
FEBRUARY 2018
Looking in your partner’s pocket before saying “Yes!”
Income assortative mating and inequality*
Carlo V. Fiorio† and Stefano Verzillo‡
February 19, 2018
Abstract
Income assortative mating has seldom been investigated in the literature, mostly because of endogeneity concerns related to simultaneity and omitted variable biases. Using the tax records of a major region in Italy for 2007 to 2011, we first show that income assortative mating is present mostly at very high income levels. In comparison with the median woman, women belonging to the top 1% of their income distribution are about 25 times more likely to get married to men belonging to the top 1% of their income distribution. Second, we show that, even when dealing with simultaneity and omitted variable biases, assortative mating remains significantly larger for very high-earning couples. Finally, we assess the effect of income assortative mating on inequality by assuming random pairing. Our results are consistent with previous results showing that the effects are limited on income inequality as measured by a summary statistic, such as the Gini index. However, by exploiting the large size of our administrative dataset, we show that, when a bivariate partition of the population is considered (e.g. couples with both spouses in the top 1% of their gender-specific distribution), the effect of assortative mating on inequality is huge, as, for the average counterfactual income determined by assuming random pairing, the top income share would be reduced by more than 80%.

JEL codes: J12, J41
Keywords: assortative mating, administrative data, inequality, income shares.

* We would like to thank Joseph Altonji, Erich Battistin, Massimiliano Bratti, Daniele Checchi, Pierre-André Chiappori, Gianni De Fraja, Nico Pestel, Giovanni Pica, Erik Plug, Zhenchao Qian, Enrico Rettore, Frédéric Robert-Nicoud, Emmanuel Saez, Christine Schwartz and Daniel Waldenström for helpful comments and suggestions. Earlier versions were presented at the SASE (Berkeley, 2016), ECINEQ (New York, 2017), IIPF (Tokyo, 2017), EALE (St. Gallen, 2017) conferences and seminars at UNIL (Lausanne), European Commission-JRC (Ispra), Univ. of Milan, Bocconi University and IRVAPP (Trento). The information and views set out in this paper are those of the authors and do not reflect a policy position of the European Union. Neither the European Commission nor any person acting on behalf of the Commission may be held responsible for the use that may be made of this publication.
† University of Milan and Irvapp-FBK. Address: Department of Economics, Management and Quantitative Methods, Via Conservatorio 7, 20122 Milan; email: [email protected]
‡ European Commission, Joint Research Centre (JRC) and the Interuniversity Research Centre on Public Services (CRISP), University of Milano-Bicocca.

1 Introduction
The theoretical literature on assortative mating in the marriage market has de-
veloped extensively since the seminal paper by Becker (1973), predicting perfect
assortativeness (for a comprehensive survey, see Chiappori and Salanié, 2016).
The sociological literature has contributed to the analysis of assortative mat-
ing (often termed "homogamy"), focusing on trends over time (e.g. Mare, 1991;
Schwartz and Mare, 2005; Gonalons-Pons and Schwartz, 2017). The analysis
of assortative mating in the sociological literature mainly uses log-linear mod-
els for contingency tables (Agresti, 2002) to provide estimates of the changing
associations between couples’ characteristics while controlling for shifts in their
marginal distributions. Usually, marginal distributions refer to a characteristic
(e.g. education) of the wife and husband and the estimated models include a
dummy variable identifying couples with the same level of that characteristic
to test whether or not "homogamy" is significantly different from zero, namely
mating is different from random. A significant and positive coefficient for the
dummy variable suggests positive assortative mating, i.e. like marries like.
When choosing the characteristics to be used for assessing homogamy, nat-
ural candidates would be social class or income; however, these are likely to be
endogenous. For instance, consider a newly wed couple in which the husband
belongs to a high social class (or has a high level of income) and marries a high
social class (or high-income) woman. It may be that the husband truly belongs
to the high social class or joined it only by marrying a high social class woman
who - because of marriage - shared family and social networks, economic re-
sources and business opportunities with him. Ideally, one would like to observe
whether or not newly-weds who appear similar in the year of marriage or after
would have been judged similar if they had been observed some years before
marrying, when they may not have known each other or been promised to each
other.
Even the best panel survey data (e.g. the Panel Study of Income Dynamics -
PSID), which follow a man (or woman) over a long period and often before and
after he (she) got married, record information on his (her) spouse only after she
(he) entered the panel by marriage, and little information is known from before
the wedding. Typically, the literature uses the highest educational attainment,
as this is available in most surveys, is affected by limited measurement
error, and - in most cases - relates to a decision made some years before getting
married. Although this is not a perfect solution, as some people finish education
after marriage and others have further motivation to complete or not complete
education because they have been promised to someone, it allows one to reduce
or at least mitigate the extent of simultaneity, and hence endogeneity, in the
measurement of assortative mating.
This paper studies income assortative mating and its effects on inequality,
focusing on newly-weds. We use a novel dataset, collecting all the tax forms
for the population of Lombardy, the richest region in Italy, with a population
close to 10 million, i.e. about as large as Portugal or Sweden and twice the
size of Denmark, over the period 2007-2011. The Italian institutional setting,
requiring individual tax filing and compulsory declaration of one’s marital part-
ner, allows us to identify the marital status of residents (i.e. single or in a
couple) and the year of marriage if it is between 2008 and 2011. We focus on
newly wed and single individuals (i.e. those who were on the marriage market
before each year considered), compute the percentiles of employment income
by gender and group them in 100 income groups (percentiles). Therefore, we
select newly-weds specifically and, by building the 100 × 100 contingency table
of male and female percentiles, we show that assessing assortative mating
by estimating the average probability that couples lie along the main diagonal
of the contingency table would be misleading, as assortative mating is
mostly concentrated in the top levels of income, increasing steeply above the 95th
percentile.¹ The unconditional frequency of newly wed couples in which both
spouses are in the top 1% of their respective gender distributions is about 25
times larger than that of newly wed couples in which at least one of the two
spouses has a median level of income. A similar pattern of unconditional assor-
tative mating was recently observed using European Union Statistics on Income
and Living Conditions (EU-SILC)² data for France by Frémeaux and Lefranc
(2017), who focused mostly on potential earnings without addressing the issue
of endogeneity given the data structure. We address the endogeneity issue by
exploiting the panel dimension of our dataset, which allows us to determine the
income percentile group of all newly-weds in the income distribution up to three
years before marriage, following the approach adopted so far in
the educational assortative mating literature. Specifically, we assess whether or
not positive assortative mating exists when controlling for the ranking of each
newly wed spouse before marriage, assuming that, for up to three years previ-
ously, newly-weds did not know each other or had not been promised to each
other.
Our paper contributes to the assortative mating literature by providing an
estimate of assortative mating based on income, which accounts for the bias due
to simultaneity and a large number of possible confounding factors. Although
the lack of a suitable instrument prevents us from making a causal interpretation
of our results,³ by exploiting the balanced panel structure of our dataset for all
newly-weds, we are able to provide an estimate of the extent of the simultaneity
bias.

¹ The average newlywed husband at the 95th percentile earns €39,600, about 20% more than the average wife at the 95th percentile.
² EU-SILC is an instrument that has collected timely and comparable cross-sectional and longitudinal multidimensional microdata on income, poverty, social exclusion and living conditions across the European Union since 2004 (https://tinyurl.com/nspxlmq).

Our results show, first, that assortative mating at the top levels of income
remains large even when newly-weds are observed three years before the wedding
takes place and, second, that, at the top levels of income, simultaneity bias is
negligible even when dealing with time-invariant individual characteristics such
as family background or education. These results are shown for newly-weds
where both spouses are aged between 25 and 44, but robustness analyses broadly
confirm the picture and suggest that income assortative mating also increases
for older newly-weds.
The analysis of assortative mating has recently also gained significance in the economic literature and has been investigated as a possible additional driver of household inequality. Although inequality as measured by indices looking at the whole distribution (such as the Gini and Theil indices, or the coefficient of variation) presents a mostly increasing trend, albeit with large variability between countries and over time, recent literature (e.g. Atkinson, Piketty, and Saez, 2011, and the World Wealth & Income Database-WID project⁴) has devoted a lot of attention to what happens in the top 1% (or above) of the income distribution, highlighting the increasing contribution of wages and salaries. The increase in college premiums and the role of phenomena such as minimum wages and unions are often regarded as the main factors explaining increased inequality (e.g. see Goldin and Katz, 2007; Autor, 2014; Katz and Autor, 1999). Changing female labour force participation and changing labour earnings distributions between households have also been analysed as factors driving inequality changes (e.g. Daly and Valletta, 2006; Fiorio, 2011; Larrimore, 2014). Recent literature has addressed whether or not educational assortative mating has played a role in explaining increased inequality, mostly by estimating the Gini index and contrasting it with the counterfactual situation where spouses were randomly matched (e.g. Eika, Mogstad, and Zafar, 2014; Greenwood, Guner, Kocharkov, and Santos, 2014; Hryshko, Juhn, and McCue, 2017a; Pestel, 2017). This literature concludes that assortative mating contributes positively to household inequality, although its contribution is minor, as the Gini index is ’only’ 10% higher than in the counterfactual situation.

³ For one of the very few contributions using an instrumental variable approach, see Barban, De Cao, Oreffice, and Quintana-Domeque (2016); this approach was unfeasible in our context for lack of data.
⁴ The comprehensive and constantly updated dataset of the WID project can be found at http://wid.world/.

Our paper’s third and final contribution to the existing literature is to show that positive assortative mating has a strong impact on the concentration of income. We focus on the effect of assortative mating on household income distribution in the population of newly-weds, assuming equal sharing of resources between the couple, and comparing the actual distribution with the distribution of counterfactual households obtained by randomly allocating spouses 1,000 times. Results suggest that the very large positive assortative mating effect observed for the top levels of income has a modest inequality-increasing effect, not larger than 10% of the Gini index, consistent with the results of other authors in different institutional settings, as cited above. However, we suggest that this conclusion is driven by the use of summary measures such as the Gini index, which are mostly sensitive to changes in the bulk of the income distribution (Cowell, 2011). Using alternative measures, such as those from the family of Atkinson indices with large inequality aversion coefficients or from the family of generalised entropy with large "α-coefficients" (i.e. where large weight is given to differences between incomes at the top of the distribution), would be problematic, as these indices are very sensitive to outliers (Cowell and Victoria-Feser, 1996). Along the lines of the literature initiated by Piketty and colleagues (Piketty, 2001; Piketty and Saez, 2003; Atkinson, 2005), we focus on the income of top earners and, focusing on all newly-weds aged 25-44 earning wage or salary (employment) income, we find that couples in which both spouses are in the top 1% of their respective income distributions earn over 61 times more than if income were equally distributed.
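The counterfactual exercise described above (equal sharing within the couple, spouses re-paired at random many times) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the incomes are synthetic and perfectly sorted, the function names (`gini`, `top_top_share`, `random_pairing`) are hypothetical, and the top 1% thresholds are computed among the newly-weds themselves rather than from the marriage-market population used in the paper.

```python
import numpy as np

def gini(x):
    """Gini index of a non-negative income vector (sorted-rank formula)."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    ranks = np.arange(1, n + 1)
    return 2.0 * np.sum(ranks * x) / (n * x.sum()) - (n + 1.0) / n

def couple_gini(wife, husband):
    """Gini of per-capita couple income under equal sharing."""
    return gini((wife + husband) / 2.0)

def top_top_share(wife, husband, q=0.99):
    """Share of total couple income held by couples in which BOTH spouses
    are above the q-th quantile of their gender-specific distribution."""
    both_top = (wife > np.quantile(wife, q)) & (husband > np.quantile(husband, q))
    total = wife + husband
    return total[both_top].sum() / total.sum()

def random_pairing(wife, husband, stat, n_draws=1000, seed=0):
    """Average of stat(wife, re-paired husbands) over n_draws random pairings."""
    rng = np.random.default_rng(seed)
    draws = [stat(wife, rng.permutation(husband)) for _ in range(n_draws)]
    return float(np.mean(draws))

# Synthetic, perfectly sorted couples (NOT the paper's data).
rng = np.random.default_rng(42)
wife = np.sort(rng.lognormal(9.8, 0.6, size=2_000))
husband = np.sort(rng.lognormal(10.1, 0.7, size=2_000))

actual_gini = couple_gini(wife, husband)
random_gini = random_pairing(wife, husband, couple_gini, n_draws=200)
actual_share = top_top_share(wife, husband)
random_share = random_pairing(wife, husband, top_top_share, n_draws=200)

print(f"Gini: actual {actual_gini:.3f}, random pairing {random_gini:.3f}")
print(f"Top-1%/top-1% income share: actual {actual_share:.4f}, "
      f"random pairing {random_share:.4f}")
```

In this toy setting both statistics fall under random pairing, and the share held by couples with both spouses in the top 1% typically falls proportionally much more than the Gini, which is the qualitative contrast the paragraph draws; the magnitudes say nothing about the paper's estimates.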
The rest of the paper is organised as follows. In Section 2, we describe the data and
present the selection rules adopted. In Section 3, we explain how we organised
the data to show income assortative mating and present the excessive assortative
mating distribution. In Section 4, we present the endogeneity issues for income
assortative mating and describe how we addressed them and, in the following
Section 5, we present the main results and robustness checks. In Section 6, we
compute the effect of assortative mating on inequality and, in Section 7, we
discuss our results.
2 The data
In this paper, we explore the full administrative dataset of tax forms for a major
Italian region (Lombardy) over the period 2007-2011, which, at present, is the
longest time span available. This is the richest region of Italy, with about 10
million residents, as large as Portugal and twice the size of Denmark. Our
dataset is based on administrative data,⁵ allowing the exact identification of
tax units.
Italian tax legislation requires the taxpayer to declare the tax identifier of
all fiscally dependent relatives (i.e. with gross earnings below €2,840) and of
their spouses, regardless of their income.
As shown in Table 1, we have annual information on approximately 10 million residents, of whom about 40% live as legally married couples.⁶ Approximately 1.7% of all couples are new couples, i.e. couples that got married within that year. In 2011, the number of residents who got married was about 63.7 thousand, showing a decreasing trend over the period studied (66,210 in 2008), for reasons that we can only guess at but that might include increased job instability, the economic crisis or changing values. The Italian National Statistical Institute (Istat; http://www.istat.it/), using registry data, reports very similar numbers of residents as couples and newly-weds in Lombardy over the same period, thus confirming the validity of the tax dataset we use.

⁵ Data were analysed as part of a research programme between the Interuniversity Research Centre on Public Services (CRISP) at the University of Milano-Bicocca and the Tax and Income Department of the Lombardy Region. Tax records were analysed having been anonymised using an irreversible hashing algorithm.

Given our focus on assortative mating, in what follows we focus on newly wed couples in which both spouses are between 25 and 44, which is the age group of those who should have finished their educational cycle, are in their best fertile years and, we expect, are less likely to base their choice of marital partner on income.⁷ Table 1 shows that, out of a total of 31,867 new couples (63,734 individuals) in 2011, about 75% had both spouses aged 25-44. By focusing on 2011 and splitting the population by gender, Table 2 shows that, on average, women get married at the age of 32 and men at 34 (panel a). Labour-active newly wed women earn an average wage or salary (employment) income that is slightly larger than that of the average active woman in this age group, whereas newly wed men earn nearly 20% more than the average man in the same age group (panel b). Total annual labour earnings of the average self-employed man are close to those of the average male employee, while the representative newly wed self-employed woman earns on average 23,000 euros, 3,000 euros more than the average newly wed female employee (panel c).

⁶ Tax form data allow the identification of legally married couples only, as taxpayers in unmarried cohabiting couples are not allowed to claim the dependent spouse tax credit and are not obliged to declare their partner. These data also do not allow identification of second marriages as opposed to first marriages.
⁷ This a priori expectation will be tested in the robustness section 5.1.

The proportion of self-employed people with positive income in the 25-44
age group is limited, comprising fewer than 5% of women and about 10% of
men, with most people in this age group working as employees (Table 3).
Table 1: Total number of residents in Lombardy by year and marital status.

Age group         Status            2008         2009        2010        2011
All ages          All residents     10,466,319   9,908,071   9,996,217   9,794,060
All ages          In a couple       3,836,995    3,870,915   3,875,486   3,837,662
All ages          In a new couple   66,210       62,998      62,712      63,734
Aged 25-44        All residents     3,234,338    2,949,590   2,915,903   2,750,288
Both aged 25-44   In a couple       1,342,160    1,310,950   1,261,876   1,207,160
Both aged 25-44   In a new couple   49,166       48,102      47,926      47,626

Notes: Individual observations. Couples are legally married couples only. New couples are couples that were not married the year before.

Table 2: Some descriptive statistics by gender for individuals aged 25-44 for 2011.

(a) Age
                  Women                               Men
                  Obs         Mean        Std. Dev.   Obs         Mean        Std. Dev.
All residents     1,357,361   35.66       5.21        1,392,927   35.70       5.35
In a couple       669,745     37.09       4.87        537,415     38.02       4.43
In a new couple   23,813      31.98       4.45        23,813      34.31       4.55

(b) Employment income (wages and salaries)
                  Women                                  Men
                  Obs         Mean        Std. Dev.      Obs         Mean        Std. Dev.
All residents     939,450     18,695.20   441,495.50     1,092,874   25,733.93   1,444,901.00
In a couple       428,719     18,959.81   16,035.85      409,860     30,748.72   66,541.22
In a new couple   17,381      20,334.63   12,368.64      17,541      30,322.66   98,524.67

(c) Self-employment income
                  Women                               Men
                  Obs         Mean        Std. Dev.   Obs         Mean        Std. Dev.
All residents     52,535      24,547.87   36,383.93   132,426     28,904.15   38,246.40
In a couple       27,192      25,078.38   33,170.84   69,852      31,603.94   42,715.91
In a new couple   1,151       23,064.91   23,944.58   3,029       29,284.09   36,493.21

Notes: Individual observations. Couples are legally married couples only. New couples are couples that were not married the year before. Employment (self-employment) income statistics are presented only for the sample of those with positive employment (self-employment) income.

Table 3: Proportions of individuals with zero income by gender for individuals aged 25-44. 2011 only.

(a) Proportion of zero employment income
                  Women                           Men
                  Obs         Mean    Std. Dev.   Obs         Mean    Std. Dev.
All residents     1,357,361   0.308   0.462       1,392,927   0.215   0.411
In a couple       669,745     0.360   0.480       537,415     0.237   0.425
In a new couple   23,813      0.270   0.444       23,813      0.263   0.440

(b) Proportion of zero self-employment income
                  Women                           Men
                  Obs         Mean    Std. Dev.   Obs         Mean    Std. Dev.
All residents     1,357,361   0.957   0.203       1,392,927   0.899   0.301
In a couple       669,745     0.955   0.207       537,415     0.864   0.343
In a new couple   23,813      0.947   0.224       23,813      0.867   0.340

Notes: Individual observations. Couples are legally married couples only. New couples are couples that were not married the year before.

3 The Excessive Mating Ratio
If assortative mating were not correlated with income, one would expect that the likelihood of a rich man getting married to a poor woman would be the same as that of a rich man getting married to a rich woman. To investigate assortative mating based on income, out of the population of residents we selected the set of people who in year $t$ were either single or just married (i.e. they were on the marriage market in year $t-1$), grouped them by gender and year, ranked each of them into 100 gender-specific percentile groups (obtained by the 99 cut points that divide the ordered set of incomes into 100 parts, each of which contains one-hundredth of the total) and built a 100 × 100 matrix of frequency counts for each year $t$. The cell $\{k, j\}_t$ identifies couples with a wife in the $k$-th wives’ income percentile and the husband in the $j$-th husbands’ income percentile in year $t$, where $k, j = 1, ..., 100$ and $t = 2008, ..., 2011$. For instance, the cell $\{100, 100\}_{2011}$ contains the couples who married in 2011 with both spouses belonging to the top 1% of their gender-specific income distribution and the cell $\{50, 100\}_{2011}$ contains couples with a husband in the top 1% and a wife whose income is between the 49th percentile and the median.
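The construction just described (gender-specific percentile groups, then a 100 × 100 matrix of couple counts) can be sketched as follows. This is a minimal illustration on synthetic incomes, not the authors' code: the function names, the lognormal parameters and the treatment of ties at cut points are all assumptions.

```python
import numpy as np

def percentile_group(income, reference):
    """Assign each income to one of 100 groups defined by the 99 percentile
    cut points of `reference` (the gender-specific marriage-market population)."""
    cuts = np.percentile(reference, np.arange(1, 100))  # 99 cut points
    # side="left": an income equal to a cut point falls in the lower group
    return np.searchsorted(cuts, income, side="left") + 1  # groups 1..100

def contingency_table(wife_group, husband_group):
    """100 x 100 matrix of counts; entry [k-1, j-1] is the number of couples
    with the wife in group k and the husband in group j."""
    counts = np.zeros((100, 100), dtype=int)
    np.add.at(counts, (wife_group - 1, husband_group - 1), 1)
    return counts

rng = np.random.default_rng(0)
# Synthetic marriage-market incomes (NOT the paper's data).
women_market = rng.lognormal(mean=9.8, sigma=0.6, size=50_000)
men_market = rng.lognormal(mean=10.1, sigma=0.7, size=50_000)
# Newly-weds drawn independently of each other, i.e. random mating.
wives = rng.choice(women_market, size=5_000)
husbands = rng.choice(men_market, size=5_000)

k = percentile_group(wives, women_market)
j = percentile_group(husbands, men_market)
c = contingency_table(k, j)
print(c.shape, c.sum())
```

The percentile cut points come from the whole marriage-market population of each gender, while the counts are taken over newly-weds only; this is what lets the marginal distribution of newly-weds differ from the uniform 1/100 benchmark.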
The observed probability that a newly wed couple will fall into the cell $\{k, j\}_t$ is the number of newly wed couples in cell $\{k, j\}_t$, $c^t_{k,j}$, divided by the total number of newly wed couples, $\sum_{k,j} c^t_{k,j}$. This allows us to compute what we call the Excessive Mating Ratio (EMR) for year $t$, which is the ratio of the observed mating probability over the theoretical probability of mating under the assumption of random mating:
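
$$
\mathrm{EMR}^t_{kj} := \frac{\text{Actual relative frequency of couples in cell } \{k, j\}_t}{\text{Theoretical relative frequency under random mating}}
= \frac{\mathrm{Prob}(w = k, h = j \mid \text{married in year } t)}{\mathrm{Prob}(w = k) \times \mathrm{Prob}(h = j)}
= \frac{c^t_{kj} / \sum_{k,j} c^t_{k,j}}{(1/100) \times (1/100)}. \tag{1}
$$
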
If $\mathrm{EMR}^t_{kj} = 1$ for all $k, j$, where $k, j = 1, 2, ..., 100$, there would be no assortative mating based on income in year $t$, meaning that people get married for other reasons (e.g. love, randomness), which are not correlated with income. When $\mathrm{EMR}^t_{kj} > 1$, the observed frequency of couples in cell $\{k, j\}_t$ exceeds the theoretical probability, i.e. there is positive assortative mating. When, instead, $\mathrm{EMR}^t_{kj} < 1$, the observed frequency of couples in cell $\{k, j\}_t$ is exceeded by the theoretical probability, i.e. there is negative assortative mating.
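As a worked illustration of equation (1), the sketch below computes the EMR from a matrix of couple counts. The function name is hypothetical, and the toy example uses 2 groups instead of 100 purely to keep the arithmetic visible; the benchmark probability is generalised to (1/n) × (1/n) for n groups, which reduces to the paper's (1/100) × (1/100) when n = 100.

```python
import numpy as np

def excessive_mating_ratio(counts):
    """EMR for one year: observed cell frequency over the random-mating
    benchmark (1/n) * (1/n), where n is the number of quantile groups."""
    counts = np.asarray(counts, dtype=float)
    n_groups = counts.shape[0]
    observed = counts / counts.sum()          # c_kj / sum_kj c_kj
    benchmark = (1.0 / n_groups) ** 2         # Prob(w = k) * Prob(h = j)
    return observed / benchmark

# Toy example with 2 income groups and 100 couples (illustrative numbers only).
toy_counts = np.array([[40, 10],
                       [10, 40]])
emr = excessive_mating_ratio(toy_counts)
print(emr)  # 1.6 on the diagonal (positive sorting), 0.4 off the diagonal
```

By construction the EMR averages to one across all cells, so values above one in some cells are always offset by values below one elsewhere.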
A preliminary descriptive analysis of the level of activity and levels of earnings
of newly-weds, discussed in the previous section, led us to take some decisions
regarding data selection for couples with both spouses in the 25-44 age
group. First, we focus on new couples formed during 2008-2011 and drop from
the analysis all records of residents who remained single during the observed
period or got married before 2008. Second, as self-employment income is, to a
large extent, self-declared and its measurement error is likely to be large, we
use employment income only, which is third-party reported with virtually no
under-reporting concerns. Similarly, capital income is excluded from the anal-
ysis because financial capital income taxation is often fully paid at source and
is therefore not declared in tax forms, and real estate and building property in-
come is measured noisily, as data do not allow us to disentangle imputed income
based on cadastral values from actual rental income. All income measures are
based on market income, i.e. before all taxes and benefits.
Figure 1 plots on a 3D surface the average of the EMR over the period considered, $\mathrm{EMR}_{kj} = \sum_{t=2008}^{2011} \mathrm{EMR}^t_{kj}/4$, where only the (gross) employment income percentile groups of both spouses above the median are shown. It shows that a marriage between people in very distant percentile groups (e.g. a man with median income and a woman in one of the top percentiles, and vice versa) is relatively unlikely, whereas positive assortative mating based on labour income is relatively more frequent. A striking pattern is the increase of positive assortative mating as the level of income rises above the 90th percentile. In particular, couples with both spouses in the top 1% of each gender-specific employment income distribution are about 25 times more frequent than couples in which both spouses have a median income.⁸ This pattern is a constant feature over time: Figure 2 plots the annual unconditional EMR and shows that positive assortative mating at top income levels persists for $t = 2008, ..., 2011$.⁹
Using Bayes’ rule, we can decompose the EMR in equation (1) as the product of two factors: the probability that a woman who got married in year $t$ to a man in percentile group $j$ belongs to percentile group $k$, and the probability that a newly wed man belongs to percentile group $j$, each normalised by the corresponding random mating probability:

$$
\mathrm{EMR}^t_{kj} := \frac{\mathrm{Prob}(w = k \mid h = j, \text{married in year } t)}{\mathrm{Prob}(w = k)} \times \frac{\mathrm{Prob}(h = j \mid \text{married in year } t)}{\mathrm{Prob}(h = j)} \tag{2}
$$

⁸ Interestingly, there are some gender differences in income assortative mating, as the share of top 1% newly wed males who get married to a top 10% woman is equal to 30%, whereas the share of top 1% newly wed females who get married to a top 10% man is equal to 53%.
⁹ For clarity, Figures 1 and 2 show results for couples in which both spouses have at least a median income; analogous figures for all percentile groups can be found in the Appendix, Subsection 8.1.

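The decomposition in equation (2) follows from applying the definition of conditional probability to the numerator of equation (1). Written out step by step, using only the notation of equations (1) and (2):

```latex
\begin{align*}
\mathrm{EMR}^t_{kj}
 &= \frac{\mathrm{Prob}(w = k,\, h = j \mid \text{married in year } t)}
         {\mathrm{Prob}(w = k)\,\mathrm{Prob}(h = j)} \\[4pt]
 &= \frac{\mathrm{Prob}(w = k \mid h = j,\ \text{married in year } t)\;
          \mathrm{Prob}(h = j \mid \text{married in year } t)}
         {\mathrm{Prob}(w = k)\,\mathrm{Prob}(h = j)} \\[4pt]
 &= \frac{\mathrm{Prob}(w = k \mid h = j,\ \text{married in year } t)}
         {\mathrm{Prob}(w = k)}
    \times
    \frac{\mathrm{Prob}(h = j \mid \text{married in year } t)}
         {\mathrm{Prob}(h = j)} .
\end{align*}
```

The first factor measures sorting conditional on the husband's group; the second measures how over- or under-represented husbands of group $j$ are among newly-weds relative to the 1/100 benchmark.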
Figure 1: Unconditional joint income distribution by percentile group of each spouse, employment income only, age group 25-44, both spouses above the median. Mean 2008-2011.
Figure 2: Unconditional joint income distribution by percentile group of each spouse, employment income only, age group 25-44, both spouses above the median, 2008-2011. [Panels: Year 2008; Year 2009; Year 2010; Average EMR over 2008-2011.]

Figure 3 plots the unconditional distribution of a newly wed man who got married in 2011 belonging to percentile $j$, i.e. $\mathrm{Prob}(h = j \mid \text{married in year } t)$ for all percentile groups, and contrasts it with the analogous distribution for newly wed women. A value equal to 1 in a particular percentile group means that the proportion of newly-weds in that percentile group is as large as random, whereas a value lower than 1 means that it is relatively unlikely for newly-weds to be in that income group, and vice versa for values above one. Focusing on people with positive income,¹⁰ it shows that the likelihood that newly wed women have an income in the 30th percentile of the gender-specific income distribution is about 50% lower than for women in the 60th percentile, and the likelihood that newly wed men have incomes in about the 30th percentile is 75% lower than for those in the 70th percentile. It also shows that the probability increase is approximately linear for both genders up to the 80th percentile for women and the 70th percentile for men; at this point they diverge, increasing for men and decreasing for women. Male newly-weds are in the top 1% of income 2.4 times more often than random, whereas female newly-weds are there only as often as random. Male newly-weds have employment income below the 20th percentile more often than random, although one should recall that this group includes newly-weds who are only self-employed, as well as newly-weds who are temporarily out of employment, possibly because of temporary unemployment or education.

¹⁰ About 20% of newly wed men and 25% of newly wed women have zero income, though both groups are more likely to get married than newly-weds with positive but very low income, possibly because their potential income is larger.

The strikingly low probability of getting married for people with positive but
low levels of income is consistent with the low probability of marriage for people
with low levels of education, which has also been found in other countries (e.g.
Schwartz and Mare, 2005).
The diverging patterns of representation of different employment income percentile groups among male and female newly-weds can be attributed to both demand for and supply of women on the marriage market. Bertrand, Kamenica, and Pan (2015), using US data, recently argued that gender identity norms may explain the aversion to situations where the wife earns more than her husband and that this affects getting married, the division of home production, marriage satisfaction and the divorce rate. In a field experiment, Bursztyn, Fujiwara, and Pallais (2017) show that single female students are less likely to report their ambitions, such as desired salaries and willingness to travel and work long hours, when observed by their potential partners. On the one hand, high-earning women might be less attractive on the marriage market, as they could threaten men’s identities and signal that they are less inclined to compromise their career in favour of their husband’s. On the other hand, high-income women might intentionally choose not to get married, as gender identity norms could make them less likely to continue working, dissatisfied with family life and likely to be working more at home.

Figure 3: Frequency of newly wed spouses in 2011 by percentile group of employment income and by gender three years before the wedding. Age group 25-44. [Plot: horizontal axis, gender-specific centile group; series, Women: Prob(w = k | newlywed) and Men: Prob(h = j | newlywed).]

4 Empirical modelling of assortative mating based on income
In this section, we provide an estimate of the conditional probability of a woman
belonging to percentile group k given that she got married at time t to a man
belonging to percentile group j, i.e. the first factor of (2).
The aim of this empirical investigation is to assess if the strong association
between wives’ and husbands’ incomes found in the top percentile remains,
even when controlling for possible confounding factors and dealing with the
co-determination (simultaneity) of each newly-wed’s income at the time of the
wedding.
The analysis starts by computing the percentiles of the income distribution in
year $t$ of men who were eligible for marriage (i.e. unmarried in year $t-1$):
the $q$-th percentile $q^h$ satisfies $\Pr[y^h < q^h] \le q/100$ and, similarly
for women, $q^w$, where $q = 1, \dots, 100$. We denote as $\{j\}^h$ the
percentile group of husbands whose incomes are between the $(j-1)$-th and the
$j$-th percentile, and as $I^h_{i,j}$ a binary indicator that is equal to one if
the husband in couple $i$ belongs to group $\{j\}^h$ (i.e.
$q^h_{j-1} < y^h < q^h_j$) and zero otherwise. Hence, using a similar notation
for wives, we provide an estimate of
$\mathrm{Prob}(w = k \mid h = j, \text{married in year } t)$ for the population
of $N$ newly wed couples in year $t$:

$$I^w_{i,k;t} = \sum_j \beta_{j,k} I^h_{i,j;t} + u_{i,k;t} \qquad (3)$$

where $i = 1, \dots, N$, $k = 1, \dots, 100$, $I^h$ is the vector indicating the
quantile group to which the wife's husband belongs, i.e.
$I^{h\prime} = [I^h_1, \dots, I^h_{100}]$, $\beta_k$ is the vector of
coefficients to be estimated and $u_{i,k}$ is the unobserved error term.
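The construction of the percentile groups described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `percentile_groups` and the use of NumPy's default (linearly interpolated) percentiles are choices made here, not details taken from the paper, and ties at a cut-off are assigned to the lower group.

```python
import numpy as np

def percentile_groups(income_eligible, income_target):
    """Assign each target income to a percentile group 1..100.

    Hypothetical helper: cut-offs are computed on the eligible population
    (e.g. all unmarried and newly wed individuals of one gender), then
    applied to the incomes of interest (e.g. newly-weds only).
    """
    # Cut-offs q_1, ..., q_99 of the eligible population's income distribution.
    cutoffs = np.percentile(income_eligible, np.arange(1, 100))
    # side="left" counts the cut-offs strictly below each income, so group j
    # collects incomes with q_{j-1} < y <= q_j (ties go to the lower group).
    return np.searchsorted(cutoffs, income_target, side="left") + 1
```

Calling the function separately for men and women gives the gender-specific groups $\{j\}^h$ and $\{k\}^w$ from which the binary indicators are built.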
Model (3) is a linear probability model, which we estimate by ordinary least
squares (OLS) for all wives' quantile groups, $k = 1, \dots, 100$, although our
attention will focus on top quantiles. However, the estimation of $\{\beta_k\}$
in (3) is likely to be affected by simultaneity bias, which would lead to an
overestimation of the true parameter. This is because the decision to get
married is co-determined between spouses, and the husband's income ranking in
the year of the wedding, $I^h_{i,j;t}$, is likely to correlate with the
unobserved characteristics of the wife, which are included in $u_{i,k;t}$. One
can think of a wedding as being decided $\ell$ years before $t$ or even earlier
and, since then, the fiancés might decide to change their behaviour (e.g.
labour market participation) in preparation for the wedding. For instance, as a
wedding is often a preparation for having children, the future spouses might
exert more effort in advance, working harder before the wedding to get a
promotion and enjoy better prospects during maternity leave, or they might
share each other's network of contacts, improving each other's chances of
getting a better job. Therefore, to limit the effect of the simultaneity bias,
we exploited as much as possible the panel dimension of our dataset, estimating
the assortative mating coefficient by looking at each newly-wed's position
$\ell$ years before marriage, where very large values of $\ell$ would make
$I^s$, for $s = w, h$, largely correlated with the family of origin background
and less correlated with the future spouse's own income (possibly not yet known
at time $t - \ell$):

$$I^w_{i,k;t-\ell} = \sum_j \beta_{j,k} I^h_{i,j;t-\ell} + u_{i,k;t-\ell} \qquad (4)$$

If the simultaneity bias were zero, the estimated $\beta$ coefficients in (3)
and (4) would be the same.
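A minimal sketch of how a linear probability model of this kind can be estimated is given below. It assumes, for illustration only, a full set of husband-group dummies and no constant, in which case each OLS coefficient is simply the share of wives in group $k$ among couples whose husband is in group $j$. The function name `lpm_top_group` is hypothetical, and the normalisation used for the reported tables (for example a constant with an omitted reference group) may differ.

```python
import numpy as np

def lpm_top_group(wife_group, husband_group, k=100, n_groups=100):
    """OLS of 1{wife in group k} on husband-group dummies (no constant).

    Illustrative assumption: with mutually exclusive dummies and no
    intercept, beta[j-1] equals the sample frequency of wives in group k
    among couples whose husband is in group j.
    """
    wife_group = np.asarray(wife_group)
    husband_group = np.asarray(husband_group)
    # Dependent variable: binary indicator I^w_{i,k}.
    y = (wife_group == k).astype(float)
    # Regressors: one column per husband percentile group, I^h_{i,j}.
    X = (husband_group[:, None] == np.arange(1, n_groups + 1)[None, :]).astype(float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta
```

Passing groups measured in the wedding year corresponds to specification (3); passing groups measured $\ell$ years earlier corresponds to specification (4), so the two coefficient vectors can be compared directly.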
A visual depiction of the average mobility of wives and husbands for the three
years before the wedding is shown in Figure 4, providing illustrative evidence
that the anticipation effect in preparation for the wedding is larger between the
40th and 60th percentile groups for both partners, while it is almost null for
those in the top levels of their income distributions.

Figure 4: Average mobility of income by gender ℓ years before marriage, age group 25-44. Marriage year 2011. Scatter points when ℓ = 3 and locally weighted regressions for ℓ = 1, 2, 3.
[Plots not reproduced. Panels: Wives; Husbands. Horizontal axis: Year t. Vertical axis: Year t - lag. Legend: Bisector and one series per lag.]

5 Empirical estimation of assortative mating based
on income
In this section, we present the results of estimating the likelihood that a newly
wed woman in income percentile group $k$ gets married to a man in income
percentile group $j$. Although the coefficients $\beta_k$ are estimated for all
$k = 1, \dots, 100$, given the picture of the joint distribution shown in
Section 3, our interest mainly focuses on the top of the distribution, and
results will be presented only for the median group and for the top percentile
groups, i.e. for $k = 50, 98, 99, 100$.
In what follows, we consider only members of newly wed couples, although
the income percentile group to which the newly wed spouses belong is identified
as above, namely based on the population of newly wed and unmarried
individuals by gender in the same age group. As we will also provide estimates
of assortative mating when each spouse's ranking in the income distribution is
observed up to $\ell = 3$ years before the wedding, as in (4), we will focus only on
newly wed couples in 2011. Table 4 reports some summary statistics by gender
for newly-weds in the 25-44 age group. It shows that, out of the 23,813 women
who got married in 2011, 203, 258, 281 and 274 were in the 50th, 98th, 99th
and 100th percentile groups, respectively. The average employment income of
the top 1% of newly wed wives is more than 6 times the mean income of median
newly wed wives. Newly wed husbands are relatively more frequent in the
top percentile groups. We found 493, 503 and 568 newly wed husbands in the
98th, 99th and 100th percentile groups, respectively, whereas only 139 newly wed
husbands have an income at the median level for this age group. The income
of newly wed husbands is higher than that of equally ranked women and the
difference is very large in the top 1%, where the right-hand tail of the income
distribution is very thick.
The mean employment income of wives in the top percentile group is about
€78,000 and that of husbands in the top group is more than twice that.
In the tables that follow, we used the smallest estimation sample for the
specifications (3)-(4) to allow comparison of results between the different
specifications and understanding of the direction of the simultaneity and omitted
variable biases.
Table 5 provides the estimates of the basic specification (3) and of
specification (4) with $\ell = 3$ for each quantile $k = 50, 98, 99, 100$. The first
two columns refer to the probability of a wife being in the 50th percentile group
given the husband's percentile group: none of the reported coefficients is
significantly different from zero.
The following columns suggest that the likelihood of a newly married wife
being in the top percentile group is highly correlated with the probability of her
husband being in the top percentile group. In particular, the next-to-last column
shows that the probability of a newly married wife being in the top percentile
group is about 10 percentage points higher if she gets married to a top-1%
husband, according to specification (3). By dealing with the simultaneity bias,
as in (4), the coefficient in the top percentile groups is reduced to 7 percentage
points, with a high degree of statistical significance.
Table 6, focusing on wives in the top 1% only, allows an assessment of the
size of the simultaneity bias and the effect of including some controls, such
as age, number of children and municipality of residence of each spouse, the
latter consisting of about 1,500 fixed effects. Comparing results derived using
no controls and with increasing lags, namely with $\ell = 0, 1, 2, 3$, one can observe
that the $\beta_{100,100}$ coefficient is reduced from 0.097 for $\ell = 0$ to 0.070 for $\ell = 3$,
suggesting that the simultaneity bias acts in the expected direction but also that
it is not large enough at the top to explain the high level of assortative mating
observed in the unconditional joint distribution. Control variables generally
increase the explanatory power of the model and slightly reduce the magnitude
of the assortative mating coefficient, although maintaining its high statistical
significance. The coefficient $\beta_{100,100}$ estimated in the last column of Table 6
suggests that a woman who got married in 2011 to a man who was in the top
1% three years earlier, at a time when they probably had not yet committed
to marriage, was about six times more likely to belong to the top 1% of her
distribution than the median woman. One should also remember that this
estimation accounts for the first part of (2) and, to obtain the unconditional
EMR, one should multiply it by about 2.5, which is the probability of a married
man belonging to the top 1%.
5.1 Robustness checks
Results are robust to different definitions of income. Table 7 has the same
structure as Table 5, the only difference being that income is now defined as
the sum of self-employment income plus wages and salaries. Considering
self-employment income in addition to employment income allows us to take into
account total labour income at the expense of increasing measurement error,
as self-employment income is largely affected by tax-avoiding behaviours. As
expected, coefficients are lower, as measurement error causes an estimation bias
towards zero, although they remain highly significant at the top levels of income.
Table 8 shows the robustness of results to different age selections. The age
groups considered are couples in which both spouses are five years younger than
in the main regressions (i.e. 20-39), older (i.e. 30-49) or in a larger age interval
(i.e. 20-59). Focusing on the estimated coefficient ($\hat{\beta}_{100,100}$), one can observe
that the probability of a top-percentile wife marrying a top-percentile husband
remains highly significant, and pointwise coefficient estimates tend to increase
when older couples are included, consistent with our a priori expectations.
Overall, these results suggest that income is a significant factor in partner
selection, although evidence for this is found only at the very top levels of
income.

Table 4: Some statistics by a selected group of employment income percentiles. All individuals are aged 25-44. 2011 only.

Employment income
                    Wives                                  Husbands
Percentile group    50       98       99       100         50       98       99       100
N. Obs.             203      258      281      274         139      493      503      568
Mean                11,931   42,260   50,564   77,851      16,327   52,154   62,792   163,227
Minimum             11,700   39,900   45,500   57,000      16,151   48,880   56,280   72,149
Maximum             12,100   45,500   56,800   omitted     16,479   56,252   71,988   omitted

Notes: All monetary figures are in current euros. Minimum and maximum values have been rounded to the nearest hundredth. The whole sample includes the 23,813 women and 23,813 men who got married in 2011.

Table 5: OLS estimates of the probability of a woman belonging to percentile I^w_k getting married to a man belonging to percentile I^h_j, for some, selected k and j, in the year of the wedding (t, i.e. ℓ = 0) and at three years before the wedding (ℓ = 3), where t is 2011.

Dependent variable   I^w_{50;t-ℓ}           I^w_{98;t-ℓ}           I^w_{99;t-ℓ}           I^w_{100;t-ℓ}
                     ℓ = 0      ℓ = 3       ℓ = 0      ℓ = 3       ℓ = 0      ℓ = 3       ℓ = 0      ℓ = 3
...
I^h_{50;t-ℓ}         -0.010     -0.010      -0.011     -0.014      -0.017     -0.017      -0.013     -0.011
                     (0.014)    (0.011)     (0.017)    (0.014)     (0.018)    (0.014)     (0.016)    (0.013)
...
I^h_{98;t-ℓ}         -0.003     -0.006      0.037***   0.015*      0.031***   0.037***    0.014*     0.008
                     (0.006)    (0.007)     (0.008)    (0.008)     (0.008)    (0.008)     (0.007)    (0.008)
I^h_{99;t-ℓ}         -0.010     0.001       0.034***   0.029***    0.039***   0.034***    0.046***   0.036***
                     (0.006)    (0.006)     (0.008)    (0.008)     (0.008)    (0.007)     (0.007)    (0.007)
I^h_{100;t-ℓ}        0.001      0.006       0.044***   0.051***    0.064***   0.059***    0.097***   0.070***
                     (0.007)    (0.007)     (0.008)    (0.009)     (0.008)    (0.009)     (0.008)    (0.008)
Fixed-effects:
Municipality         No         No          No         No          No         No          No         No
N. children          No         No          No         No          No         No          No         No
Age                  No         No          No         No          No         No          No         No
Observations         11,062     11,062      11,062     11,062      11,062     11,062      11,062     11,062
R-squared            0.006      0.006       0.012      0.012       0.020      0.018       0.029      0.019

Notes: Results are presented for a selection of I^h_j, namely for the median and the top three percentile groups. Estimates for different values of j can be obtained upon request from the authors. Only couples in which both spouses were resident in Lombardy for three years before the wedding are considered. Standard errors in parentheses; *** p<0.01, ** p<0.05, * p<0.1.

Table 6: OLS estimates of the probability of a woman belonging to the top 1% I^w_100 getting married to a man belonging to percentile I^h_j, for some, selected j, in the year of the wedding (t, i.e. ℓ = 0) and at three years before the wedding (ℓ = 3), where t is 2011.

Dependent variable   I^w_{100;t-ℓ} in all columns
                     ℓ = 0      ℓ = 0       ℓ = 1      ℓ = 1       ℓ = 2      ℓ = 2       ℓ = 3      ℓ = 3
...
I^h_{50;t-ℓ}         -0.013     -0.000      -0.003     0.003       -0.010     -0.007      -0.011     -0.011
                     (0.016)    (0.018)     (0.013)    (0.014)     (0.012)    (0.015)     (0.013)    (0.016)
...
I^h_{98;t-ℓ}         0.014*     0.010       0.022***   0.020***    0.010      0.005       0.008      0.004
                     (0.007)    (0.008)     (0.007)    (0.008)     (0.007)    (0.008)     (0.008)    (0.009)
I^h_{99;t-ℓ}         0.046***   0.051***    0.035***   0.030***    0.041***   0.030***    0.036***   0.036***
                     (0.007)    (0.008)     (0.007)    (0.008)     (0.007)    (0.008)     (0.007)    (0.008)
I^h_{100;t-ℓ}        0.097***   0.090***    0.082***   0.066***    0.094***   0.086***    0.070***   0.057***
                     (0.008)    (0.008)     (0.008)    (0.008)     (0.008)    (0.009)     (0.008)    (0.009)
Fixed-effects:
Municipality         No         Yes         No         Yes         No         Yes         No         Yes
N. children          No         Yes         No         Yes         No         Yes         No         Yes
Age                  No         Yes         No         Yes         No         Yes         No         Yes
Observations         11,062     11,062      11,062     11,062      11,062     11,062      11,062     11,062
R-squared            0.029      0.200       0.025      0.197       0.025      0.165       0.019      0.177

Notes: Results are presented for a selection of I^w_k and I^h_j, namely for the median and the top three percentile groups. Estimates for different values of k and j can be obtained upon request from the authors. Standard errors in parentheses; *** p<0.01, ** p<0.05, * p<0.1.

Table 7: OLS estimates of the probability of a woman belonging to percentile group I^w_k getting married to a man belonging to percentile group I^h_j, for some, selected k and j, in the year of the wedding (t, i.e. ℓ = 0) and at three years before the wedding (ℓ = 3), where t is 2011. Income from both employment and self-employment (labour income).

Dependent variable   I^w_{50;t-ℓ}           I^w_{98;t-ℓ}           I^w_{99;t-ℓ}           I^w_{100;t-ℓ}
                     ℓ = 0      ℓ = 3       ℓ = 0      ℓ = 3       ℓ = 0      ℓ = 3       ℓ = 0      ℓ = 3
...
I^h_{50;t-ℓ}         -0.002     -0.004      0.001      -0.023      0.004      -0.015      -0.017     -0.021
                     (0.011)    (0.013)     (0.013)    (0.015)     (0.014)    (0.015)     (0.012)    (0.013)
...
I^h_{98;t-ℓ}         -0.001     -0.002      0.036***   0.005       0.028***   0.049***    0.013*     0.010
                     (0.007)    (0.009)     (0.008)    (0.010)     (0.009)    (0.010)     (0.008)    (0.009)
I^h_{99;t-ℓ}         -0.012*    -0.005      0.049***   0.041***    0.042***   0.056***    0.042***   0.049***
                     (0.007)    (0.009)     (0.008)    (0.010)     (0.009)    (0.010)     (0.008)    (0.009)
I^h_{100;t-ℓ}        -0.008     -0.001      0.029***   0.048***    0.076***   0.052***    0.068***   0.025***
                     (0.007)    (0.009)     (0.008)    (0.011)     (0.009)    (0.010)     (0.008)    (0.009)
Controls:
Municipality         No         No          No         No          No         No          No         No
N. children          No         No          No         No          No         No          No         No
Age                  No         No          No         No          No         No          No         No
Observations         11,062     11,062      11,062     11,062      11,062     11,062      11,062     11,062
R-squared            0.011      0.259       0.014      0.166       0.025      0.183       0.024      0.165

Notes: Results are presented for a selection of I^w_k and I^h_j, namely for the median and the top three percentile groups. Estimates for different values of k and j can be obtained upon request from the authors. Only couples in which both spouses were resident in Lombardy for three years before the wedding are considered. Standard errors in parentheses; *** p<0.01, ** p<0.05, * p<0.1.

Table 8: OLS estimates of the probability of a woman belonging to percentile group I^w_100 getting married to a man belonging to percentile I^h_j, for some, selected k and j, at time t - 3, where t is 2011, controlling for only wives' and for both spouses' observable characteristics and both spouses' individual fixed effects. Robustness checks by different age groups.

                     Age 20-39              Age 30-49              Age 20-59
Dependent variable   I^w_{100;t-ℓ} in all columns
                     ℓ = 0      ℓ = 3       ℓ = 0      ℓ = 3       ℓ = 0      ℓ = 3
...
I^w_{50;t-3}         -0.013     -0.006      -0.007     -0.056*     -0.001     0.035**
                     (0.023)    (0.014)     (0.030)    (0.030)     (0.013)    (0.015)
...
I^w_{98;t-3}         0.039***   0.010       0.005      0.005       0.006      0.019***
                     (0.008)    (0.008)     (0.013)    (0.012)     (0.006)    (0.007)
I^w_{99;t-3}         0.030***   0.035***    0.045***   0.042***    0.042***   0.036***
                     (0.008)    (0.008)     (0.013)    (0.012)     (0.006)    (0.006)
I^w_{100;t-3}        0.105***   0.078***    0.085***   0.052***    0.092***   0.092***
                     (0.009)    (0.008)     (0.012)    (0.013)     (0.007)    (0.007)
Controls:
Municipality         Yes        Yes         Yes        Yes         Yes        Yes
N. children          Yes        Yes         Yes        Yes         Yes        Yes
Age                  Yes        Yes         Yes        Yes         Yes        Yes
Observations         11,687     11,687      6,086      6,086       16,344     16,344
R-squared            0.189      0.161       0.215      0.213       0.148      0.135

Notes: Results are presented for a selection of I^h_j, namely for the median and the top three percentile groups. Estimates for different values of j can be obtained upon request from the authors. Only couples in which both spouses were resident in Lombardy for three years before the wedding are considered. Standard errors in parentheses; *** p<0.01, ** p<0.05, * p<0.1.

6 Effects of assortative mating on inequality
Finally, we address the question of whether or not earnings assortative mating
has an effect on income inequality. Greenwood, Guner, Kocharkov, and Santos
(2014), Eika, Mogstad, and Zafar (2014) and Hryshko, Juhn, and McCue (2017b)
showed that educational assortative mating increases household income
inequality. Here we assess by how much income assortative mating affects household
income inequality measures and income shares, which are of particular interest
given the highly positive assortative mating at top income levels.
We focus on new couples in 2011 where both spouses are aged between 25
and 44 and at least one of them earns employment income, although the results
are broadly consistent between different age groups and with those including
self-employment income. We restrict the analysis to newly-weds because they
represent a very small proportion of the total population (about 0.6% in each
year), so that even large effects of assortative mating in one year would have a
negligible effect on the concentration of income in the overall distribution. These
measures are provided for individual incomes, assuming income is shared equally in
each couple. Actual income inequality measures and shares are compared with
counterfactual measures obtained by randomly generating couples in which each
husband is matched with a randomly chosen wife, assuming that labour supply
and income are exogenous to household formation. This random allocation of
husbands to wives is repeated 1,000 times, generating a counterfactual
distribution of randomly allocated couples.
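The counterfactual exercise described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `gini` and `random_mating_gini` are hypothetical, the Gini index is computed with the standard rank-based formula for sorted incomes, and random mating is implemented as a random permutation of wives across husbands, which is one natural reading of "each husband is matched with a randomly chosen wife".

```python
import numpy as np

def gini(x):
    """Gini index of non-negative incomes via the rank-based formula."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    ranks = np.arange(1, n + 1)
    return 2.0 * np.sum(ranks * x) / (n * np.sum(x)) - (n + 1.0) / n

def random_mating_gini(wife_income, husband_income, n_rep=1000, seed=0):
    """Actual Gini and Gini under n_rep random re-pairings of spouses.

    Income is assumed to be shared equally within each couple, so every
    spouse is assigned half of the couple's total income.
    """
    wife_income = np.asarray(wife_income, dtype=float)
    husband_income = np.asarray(husband_income, dtype=float)
    rng = np.random.default_rng(seed)
    actual = gini(np.repeat((wife_income + husband_income) / 2.0, 2))
    sims = np.empty(n_rep)
    for r in range(n_rep):
        # Re-pair each husband with a wife drawn without replacement.
        shuffled = rng.permutation(wife_income)
        sims[r] = gini(np.repeat((shuffled + husband_income) / 2.0, 2))
    return actual, sims
```

The mean and standard deviation of `sims` play the role of the "Random Mating" columns of Table 9, while `actual` corresponds to the observed value.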
Figure 5 shows results in which random mating income inequality indices and
share distributions are contrasted with actual figures, which are represented by
a vertical line. The distributions of random mating inequality indices are
roughly symmetric and highly concentrated around the mean, showing that,
under random mating, inequality would be lower: the Gini index falls from an
actual value of 0.345 to a counterfactual mean of 0.322 (Table 9).
The size of the assortative mating effect on income inequality seems to be
in line with, for example, that found by Eika, Mogstad, and Zafar (2014) and
well below 10%. In particular, administrative data show that the reason why
random mating does not change the top income share is that a top-earning
spouse (often the husband in Italy) is sufficient to place the household
in the top income percentile group. Our data allow us to go deeper with the
counterfactual analysis and to show that the picture changes if one focuses on
household (employment) income when:
(a) both spouses belong to the top 1% of their gender income distributions;
(b) both spouses belong to the next 4% of their gender income distributions
(i.e. they both have incomes between the 95th and 99th percentile);
(c) both spouses belong to the next 5% of their gender income distributions
(i.e. they both have incomes between the 90th and 95th percentile);
(d) both spouses belong to the next 40% of their gender income distributions
(i.e. they both have incomes above the median and below the 90th per-
centile).
To facilitate interpretation, we normalised each income share by dividing it by
the random probability of falling into each group, i.e. (1/100 × 1/100) for the top
1%, (5/100 × 5/100 to 1/100 × 1/100) for the next 4%, (10/100 × 10/100 to
5/100 × 5/100) for the next 5%, and (50/100 × 50/100 to 10/100 × 10/100) for the
next 40%.

Figure 5: Gini inequality index assuming income is shared equally in a couple. Actual value (blue vertical line) and histogram distribution of randomly allocated spouses (histograms, with 1,000 different random allocations of spouses, and mean value shown by a red line). Year 2011, both spouses aged 25-44.
[Plot not reproduced. Horizontal axis: Gini. Legend: Counterf. distribution, Counterf. mean, Actual.]

Figure 6 shows the results of plotting the distribution of 1,000 different
random mating simulations and the actual value depicted as a vertical line.
Table 9 complements the figures by showing the pointwise estimates. Panel (a)
shows that the actual employment income share of couples in which both spouses
are in the top 1% of their respective income distributions is 61.4 times larger
than random. Had the marriage been independent of income, the average share
would have been much lower, around 11.3, i.e. over 80% lower, although with a
large variance. Panel (b) shows that, for couples with both spouses in the next
4%, the actual employment income share is 16.2 times larger than would be
expected with a random distribution of income. The average under random mating
would be about one third of this.
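As a quick check of the magnitudes quoted above, the ratios can be recomputed directly from the point estimates reported in Table 9 (actual versus random-mating mean). The arithmetic below uses only those reported figures; the Gini comparison is included for reference.

```latex
\[
\text{Top 1\% couples: } \frac{61.418 - 11.285}{61.418} \approx 0.816,
\qquad
\text{Next 4\% couples: } \frac{6.128}{16.235} \approx 0.377,
\qquad
\text{Gini: } \frac{0.345 - 0.322}{0.345} \approx 0.067 .
\]
```

The first ratio corresponds to the "over 80% lower" statement, the second to "about one third", and the third shows that the change in the Gini index is much smaller than the change in the normalised top shares.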

Figure 6: Actual values (blue vertical line) and histogram distribution of randomly allocated spouses (histograms, with 1,000 different random allocations of spouses, and mean value shown by a red line), normalised by the total probability under uniform distribution. Year 2011, both spouses aged 25-44, 1,000 random allocations of spouses.
[Plots not reproduced. Panels: (a) Top 1% couples, both spouses in top 1%; (b) Next 4% couples, both spouses in 95th-99th percentile; (c) Next 5% couples, both spouses in 90th-95th percentile; (d) Next 40% couples, both spouses in 50th-90th percentile. Legend: Counterf. distribution, Counterf. mean, Actual.]

Table 9: Income shares with respect to random distribution, with actual and randomly allocated couples in 1,000 random matching for some selected income groups. Both spouses aged 25-44.

(a) Inequality
                                        Actual                Random Mating
                                        Assortative Mating    mean        st. dev.
Gini                                    0.345                 0.322       0.001

(b) Top income shares
                                        Actual                Random Mating
                                        Assortative Mating    mean        st. dev.
Both spouses in top 1%                  61.418                11.285      15.743
Both spouses in 95th-99th percentile    16.235                6.128       1.491
Both spouses in 90th-95th percentile    5.456                 3.377       0.479
Both spouses in 50th-90th percentile    1.655                 1.828       0.033

Notes: Statistics computed based on the residents who got married in 2011: 63,734 individuals (31,867 couples). Top-1% couples are those in which both spouses belong to the top 1% of their respective gender income distribution; in the next 4% of couples, both spouses have incomes between the 95th and 99th percentiles; in the next 5% of couples, both spouses have incomes between the 90th and 95th percentiles and, in the next 40% of couples, both spouses have incomes above the median and below the 90th percentile. To facilitate interpretation, we divided each share by the probability of falling into each group under a uniform distribution, i.e. (1/100 × 1/100) for the top 1%, (5/100 × 5/100 to 1/100 × 1/100) for the next 4%, (10/100 × 10/100 to 5/100 × 5/100) for the next 5%, and (50/100 × 50/100 to 10/100 × 10/100) for the next 40%. The random mating distribution is obtained by matching each woman to a randomly chosen man and replicating the process 1,000 times, hence computing mean and standard deviation.

7 Concluding comments
In this paper, we exploited a novel dataset to assess income assortative mating
among newly-weds, which has seldom been analysed in the literature, mostly
because datasets recording individual incomes typically do not measure the
incomes of both spouses before marriage and therefore do not allow one to gauge
the extent of endogeneity caused by simultaneity. Italian legislation allows us
to identify new couples using tax records, and the longitudinal dimension of
the data allows us to account for the simultaneity of the spouses' incomes. We
found that positive income assortative mating is large and stable, especially
at the top of the income distribution, whereas there is no sign of significantly
positive assortative mating for most of the rest of the distribution. Our
analysis cannot claim to determine causality, although we controlled for a
detailed set of municipality variables, as well as other observable
characteristics of both spouses, and assessed assortative mating at three years
before the wedding, when the spouses most likely were not promised to each
other in marriage and, in some cases, had not even met each other. Our results
show that assortative mating is large at the top levels of income, even when
considering only employment income, which is well known to be relatively more
equally distributed than capital income and wealth, for which, unfortunately,
we have no data. A similar approach can be used for countries where
administrative data and suitable institutional settings are available.
There might be several reasons why this happens. It is not surprising that
most people want to marry someone whose background, lifestyle, cultural and
religious views and income are similar to their own. Assortative mating can be a
cause for celebration, particularly if shared backgrounds and interests result in
stronger and more stable relationships. However, our analysis of the effect of
income assortative mating on inequality suggests that it affects the distribution
of resources across the population and therefore might be a cause for concern.
For instance, it might exacerbate the problem of low social mobility and the
intergenerational transmission of adverse conditions. Regarding family
formation, for example, less well-off families are likely to provide lower levels
of education to their offspring and less educated people are much more likely to
have a child before marrying. Indeed, recent literature (e.g. Chetty, Hendren, Kline,
and Saez, 2014) shows that a child born in the bottom income quintile group
has a higher chance of remaining in that quintile as an adult and parents in the
bottom income quintile group are more likely than middle-income parents to
score among the weakest parents.
8 Appendix
8.1 The EMR over all percentiles
Figure 7 shows the three-dimensional profile of EMR for all percentile groups.
It shows that couples with no employment income are relatively less likely to
get married than couples with some income. Interestingly, it seems that the fre-
quency of top-income husbands getting married to zero-income women increases
slightly over time.

Figure 7: Unconditional joint income distribution by percentile group of each spouse, employment income only, age group 25-44, 2008-2011.
[Plots not reproduced. Panels: Year 2008, Year 2009, Year 2010, Year 2011.]

8.2 Estimation of the average homogamy coefficient
The sociological literature normally uses the term "homogamy" as synonymous
with assortative mating and describes changes in mating patterns using log-
linear models for contingency tables according to certain characteristics (e.g.
educational attainment or income category) (Agresti, 2002). These models focus
on estimates of the average association between couples’ characteristics, namely
the diagonal in the contingency table, controlling for shifts in the marginal
distributions.
This literature (e.g. Schwartz, 2010) usually estimates a model of average
assortative mating, which is fully saturated with the cross interactions between
the husband’s and wife’s characteristics. In the case of assortative mating based
on income, the model would be the following:

$$\log(c^t_{kj}) = \lambda + \sum_k \lambda^w_k I^w_{k;t} + \sum_j \lambda^h_j I^h_{j;t} + \sum_{kj} \lambda_{k,j} I^w_{k;t} \cdot I^h_{j;t} + \lambda_H I_{H;t} + \epsilon_{kj;t} \qquad (5)$$

where, consistent with the notation used in Section 4, $c^t_{kj}$ is the number of
newly wed couples with a husband in percentile group $j$ and a wife in percentile
group $k$ in year $t$; $I^s_{k;t}$ is a binary indicator that takes a value equal to one if
the spouse $s = w, h$ belongs to group $\{k\}^s$, for $k = 1, \dots, 100$; $I_{H;t}$ is a binary
indicator, which takes a value equal to one if the newly wed spouses belong to
the same percentile group; the $\lambda$s are coefficients to be estimated; and
$\epsilon_{kj;t}$ is an error term. The model (5) is estimated using a Poisson
regression (e.g. Cameron and Trivedi, 2013), and the homogamy coefficient is
$\lambda_H$, which is the average assortative mating along the main diagonal of
the contingency table.
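To illustrate how a homogamy coefficient of this kind can be recovered from a contingency table, the sketch below fits a Poisson log-linear model by iteratively reweighted least squares. It is a simplified illustration under stated assumptions: it keeps only the constant, the wife-group and husband-group main effects and the diagonal (homogamy) term, and omits the cross interactions $\lambda_{k,j}$ of the saturated model (5). The function names `design_matrix` and `fit_poisson` are hypothetical.

```python
import numpy as np

def design_matrix(n_groups):
    """Rows are cells (k, j) of the wife-by-husband contingency table.

    Columns: constant, wife-group dummies (first group omitted),
    husband-group dummies (first group omitted), diagonal indicator.
    """
    rows = []
    for k in range(n_groups):
        for j in range(n_groups):
            x = np.zeros(2 * n_groups)
            x[0] = 1.0                             # constant
            if k > 0:
                x[k] = 1.0                         # wife in group k
            if j > 0:
                x[n_groups - 1 + j] = 1.0          # husband in group j
            x[2 * n_groups - 1] = float(k == j)    # same group: homogamy term
            rows.append(x)
    return np.array(rows)

def fit_poisson(X, y, n_iter=100, tol=1e-10):
    """Poisson regression with log link, fitted by IRLS (Newton steps)."""
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())          # start from a constant-only fit
    for _ in range(n_iter):
        eta = X @ beta
        mu = np.exp(eta)
        z = eta + (y - mu) / mu         # working response
        XtW = X.T * mu                  # weights equal the fitted means
        new_beta = np.linalg.solve(XtW @ X, XtW @ z)
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta
```

With `y` holding the cell counts $c^t_{kj}$ in the same row order as `design_matrix`, the last element of the fitted vector is the homogamy coefficient, and its exponential is the multiplicative effect of lying on the diagonal.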
By taking into account the endogeneity resulting from simultaneity, as in
Section 4, and using the income percentiles of $\ell$ periods earlier, we can estimate
the following:

$$\log(c^t_{kj}) = \lambda + \sum_k \lambda^w_k I^w_{k;t-\ell} + \sum_j \lambda^h_j I^h_{j;t-\ell} + \sum_{kj} \lambda_{k,j} I^w_{k;t-\ell} \cdot I^h_{j;t-\ell} + \lambda_H I_{H;t-\ell} + \epsilon_{kj;t-\ell} \qquad (6)$$

$\lambda_H$ is a "homogamy term" added to estimate the average effect of belonging to
the same percentile group, which means lying on the diagonal of the contingency
table for both the spouses.
For computational reasons, instead of using percentile groups, we have used
ventile groups, i.e. equally frequent groups with one twentieth of the total
population in each group. In Table 10, we show the estimation of the probability
of having both newly wed spouses in the same income ventile for models (5) and
(6) in the first and second column. It shows a significant homogamy probability
of lying on the diagonal with both $\ell = 0$ and $\ell = 3$, suggesting that newly-weds
are twice as likely to belong to the same income ventile as to belong to different
ones and that the homogamy probability does not change significantly even if
simultaneity is accounted for.

Table 10: Odds ratio of models (5) and (6) estimated with Poisson models

                                   log(c^t_{kj})    log(c^t_{kj})
                                   ℓ = 0            ℓ = 3
Homogamy coefficient λ_H           1.929***         1.961***
                                   (0.080)          (0.140)
Observations                       400              400

Notes: Only the homogamy coefficient λ_H is shown. The regressions include but do not show all other coefficients in the models. Standard errors in parentheses; *** p<0.01, ** p<0.05, * p<0.1.

References
Agresti, A. (2002): Categorical Data Analysis. John Wiley and Sons.
Arrondel, L., and N. Frémeaux (2016): “‘For Richer, For Poorer’: Assor-tative Mating and Savings Preferences,” Economica, 83(331), 518–543.
Atkinson, A. B. (2005): “Top incomes in the UK over the 20th century,” Jour-nal of the Royal Statistical Society: Series A (Statistics in Society), 168(2),325–343.
Atkinson, A. B., T. Piketty, and E. Saez (2011): “Top Incomes in theLong Run of History,” Journal of Economic Literature, 49(1), 3–71.
Autor, D. H. (2014): “Skills, Education, and the Rise of Earnings InequalityAmong the “Other 99 Percent”,” Science, 344(6186), 843–851.
Barban, N., E. De Cao, S. Oreffice, and C. Quintana-Domeque (2016):“Assortative Mating on Education: A Genetic Assessment,” Discussion paper,University of Oxford, Department of Economics Economics Series WorkingPapers.
Becker, G. (1973): “A Theory of Marriage: Part I,” Journal of PoliticalEconomy, 81(4), 813–46.
Bertrand, M., E. Kamenica, and J. Pan (2015): “Gender Identity andRelative Income within Households,” The Quarterly Journal of Economics,130(2), 571–614.
Bursztyn, L., T. Fujiwara, and A. Pallais (2017): “’Acting Wife’: Mar-riage Market Incentives and Labor Market Investments,” American EconomicReview, 107(11), 3288–3319.
Cameron, A., and P. Trivedi (2013): Regression Analysis of Count Data.Cambridge University Press.
Chetty, R., N. Hendren, P. Kline, and E. Saez (2014): “Where is theland of Opportunity? The Geography of Intergenerational Mobility in theUnited States,” The Quarterly Journal of Economics, 129(4), 1553–1623.
Chiappori, P.-A., and B. Salanié (2016): “The Econometrics of MatchingModels,” Journal of Economic Literature, 54(3), 832–61.
Cowell, F. (2011): Measuring Inequality. Oxford University Press.
Cowell, F. A., and M. P. Victoria-Feser (1996): “Robustness Propertiesof Inequality Measures,” Econometrica, 64, 77–101.
Daly, M. C., and R. G. Valletta (2006): “Inequality and Poverty in theUnited States: The E�ects of Rising Wage Dispersion of Men’s Earnings andChanging Family Behaviour,” Economica, 73, 75–98.
42
Eika, L., M. Mogstad, and B. Zafar (2014): “Educational AssortativeMating and Household Income Inequality,” NBER Working Papers 20271,National Bureau of Economic Research, Inc.
Fiorio, C. V. (2011): “Understanding Italian Inequality Trends,” Oxford Bul-letin of Economics and Statistics, 73(2), 255–275.
Frémeaux, N., and A. Lefranc (2017): “Assortative mating and earnings inequality in France,” Discussion Paper 11084, IZA DP.
Goldin, C., and L. F. Katz (2007): “The Race between Education and Technology: The Evolution of U.S. Educational Wage Differentials, 1890 to 2005,” NBER Working Papers 12984, National Bureau of Economic Research, Inc.
Gonalons-Pons, P., and C. R. Schwartz (2017): “Trends in Economic Homogamy: Changes in Assortative Mating or the Division of Labor in Marriage?,” Demography, 54(3), 985–1005.
Greenwood, J., N. Guner, G. Kocharkov, and C. Santos (2014): “Marry Your Like: Assortative Mating and Income Inequality,” American Economic Review, 104(5), 348–53.
Hryshko, D., C. Juhn, and K. McCue (2017a): “Trends in earnings inequality and earnings instability among U.S. couples: How important is assortative matching?,” Labour Economics, 48(Supplement C), 168–182.
(2017b): “Trends in earnings inequality and earnings instability among U.S. couples: How important is assortative matching?,” Labour Economics, 48(Supplement C), 168–182.
Katz, L., and D. Autor (1999): “Changes in the wage structure and earnings inequality,” in Handbook of Labor Economics, ed. by O. Ashenfelter, and D. Card, vol. 3, Part A, chap. 26, pp. 1463–1555. Elsevier, 1 edn.
Larrimore, J. (2014): “Accounting for United States Household Income Inequality Trends: The Changing Importance of Household Structure and Male and Female Labor Earnings Inequality,” Review of Income and Wealth, 60(4), 683–701.
Mare, R. D. (1991): “Five Decades of Educational Assortative Mating,” American Sociological Review, 56(1), 15–32.
Pestel, N. (2017): “Marital Sorting, Inequality and the Role of Female Labour Supply: Evidence from East and West Germany,” Economica, 84(333), 104–127.
Piketty, T. (2001): Les Hauts revenus en France au 20e siècle : inégalités et redistribution, 1901-1998. Grasset.
Piketty, T., and E. Saez (2003): “Income Inequality in the United States, 1913-1998,” Quarterly Journal of Economics, 118(1), 1.
Schwartz, C. (2010): “Earnings Inequality and the Changing Association between Spouses’ Earnings,” American Journal of Sociology, 115(5), 1524–1557.
Schwartz, C. R., and R. D. Mare (2005): “Trends in educational assortative marriage from 1940 to 2003,” Demography, 42(4), 621–646.