Understanding variation in graduate earnings using tax data
Anna Vignoles University of Cambridge
LEO published results due soon…
Outline
• Graduate wage variation in SLC-HESA-HMRC linked data
  – Britton, Dearden, Shephard, Vignoles
  – Includes employed and self-employed workers
• LEO data – what we can do methodologically
  – Belfield, Britton, Buscha, Dearden, Dickson, van der Erve, Sibieta, Vignoles, Zhu
  – Richer controls for educational background
• How can we really use these data for policy?
Data access
• Illustration of power of linked admin data
• Exemplar of use of such data for policy-focused research
  – SLC-HESA-HMRC data accessed in HMRC lab
  – LEO accessed under contract to some researchers
• In the future huge scope for other work using these data – need access for wider research community
HMRC Disclaimer
• HM Revenue & Customs (HMRC) agrees that the figures and descriptions of results in the attached document may be published. This does not imply HMRC's acceptance of the validity of the methods used to obtain these figures, or of any analysis of the results.
• Copyright of the statistical results may not be assigned. This work contains statistical data from HMRC which is Crown Copyright. The research datasets used may not exactly reproduce HMRC aggregates. The use of HMRC statistical data in this work does not imply the endorsement of HMRC in relation to the interpretation or analysis of the information.
SLC Disclaimer
• The Student Loans Company (SLC) agrees that the figures and descriptions of results in the attached document may be published. This does not imply SLC’s acceptance of the validity of the methods used to obtain these figures, or of any analysis of the results.
• Copyright of the statistical results may not be assigned. This work contains statistical data from SLC which is protected by Copyright, the ownership of which is retained by SLC. The research datasets used may not exactly reproduce SLC aggregates.
• The use of SLC statistical data in this work does not imply the endorsement of SLC in relation to the interpretation or analysis of the information.
GRADUATE WAGE VARIATION IN HESA-HMRC LINKED DATA
Motivation
• Relative graduate earnings have remained high despite expansion in student numbers
• But variation in graduate outcomes has increased
• What is the extent of inequality in graduate earnings:
  – by institution?
  – by subject?
  – by socio-economic background?
Previous literature
• Literature on causal impact of education on individuals' earnings
  – Blundell et al. (2005), Becker (1962), Card (1999, 2012)
• Empirical work for UK, including returns by degree subject (unable to consider institution)
  – Blundell et al. (2005), Bratti et al. (2005), Chevalier (2011), Hussain et al. (2009), Sloane and O'Leary (2005), Smith and Naylor (2001), Walker and Zhu (2011, 2013)
• US evidence of heterogeneity in returns by college and major
  – Monks (2000), Arcidiacono (2004)
• Growing literature on use of administrative data
  – Black et al. (2005), Bhuller et al. (2011), Carneiro et al. (2013), Chetty (2014)
Data
• Individuals domiciled in England who received loans from the Student Loans Company (SLC)
  – Loan take up 85-90%
• Merging data
  – Income tax data from HMRC
  – Borrowing records from the SLC
  – HESA course level data
Data
• 2.6 million students who borrowed from the SLC between 1998 and 2010 (& are “in repayment”)
• Gender
• Higher Education Institution – last known
  – Institutions with 1000+ loans are included individually - there are 170
  – Other institutions are grouped together in an ‘other’ category
• Cohort (i.e. first year of study)
• Subject studied (first letter of JACS code in 85% of cases and 100% of subject groups)
• Amount borrowed from SLC, and voluntary repayments
• NI number (encrypted)
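The institution grouping rule on this slide can be sketched in a few lines. This is an illustration only, not the authors' code: the function name `group_institutions` and the one-entry-per-loan-record layout are assumptions, and the toy usage lowers the threshold so the effect is visible. Only the 1000-loan cut-off and the ‘other’ category come from the slide.

```python
from collections import Counter


def group_institutions(institutions, min_loans=1000):
    """Map each loan record's institution to itself or to 'other'.

    `institutions` holds one label per loan record (hypothetical layout).
    Institutions with at least `min_loans` records keep their own label;
    all remaining institutions are pooled into a single 'other' category.
    """
    counts = Counter(institutions)
    return [inst if counts[inst] >= min_loans else "other" for inst in institutions]


# Toy usage with a tiny threshold: "B" has too few records and is pooled.
records = ["A", "A", "A", "B", "C", "C", "C"]
grouped = group_institutions(records, min_loans=3)
```

Counting first and relabelling second keeps the rule independent of record order.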
Sample
• HMRC use a 10% random sub-sample selected via NI number
• All self-assessment (SA) records from 2002/03 to 2012/13 tax years
  – 260,000 students who borrowed between 1998 and 2010
  – PAYE and Self Assessment records from 2002/03 to 2012/13 tax years
  – Focus on 2008/09-2012/13
Data quality
• HMRC data high quality measure of earnings, at least above the minimum earnings threshold
  – If we have both PAYE and SA data, we use SA data.
  – Treat HMRC data as more reliable than SLC data (e.g. gender).
• Earnings from labour
  – Income from employment, partnership, self employment
• We exclude:
  – Foreign income
  – Income from dividends, capital gains, pensions or inheritance.
  – Pension contributions. Undesirable.
• No HMRC record: we input earnings as 0
  – Excludes those working abroad
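The earnings rules on this slide can be written as a small function. This is a sketch under stated assumptions, not the authors' processing code: the function name, the dictionary layout and the component keys (`employment`, `partnership`, `self_employment`, `dividends`) are hypothetical. The treatment of pension contributions is not modelled because the slide does not spell it out.

```python
def annual_labour_earnings(paye=None, sa=None):
    """Return labour earnings for one person-year under the slide's rules.

    `paye` and `sa` are hypothetical dicts of income components in pounds,
    or None when no record of that type exists for the tax year.
    """
    # Components the slide counts as earnings from labour.
    labour_keys = ("employment", "partnership", "self_employment")
    # If both PAYE and SA records exist, the SA record is used.
    record = sa if sa is not None else paye
    if record is None:
        # No HMRC record at all: earnings are entered as zero.
        return 0.0
    # Foreign income, dividends, capital gains, pensions and inheritance
    # are excluded simply by not being listed in `labour_keys`.
    return float(sum(record.get(key, 0.0) for key in labour_keys))


# Toy usage: the SA record wins and its dividends are ignored.
both = annual_labour_earnings(
    paye={"employment": 20000},
    sa={"employment": 21000, "self_employment": 4000, "dividends": 3000},
)
```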
Data
• We estimate individual institution effects and can name most Russell Group providers
• HESA data to enable us to compare similar institutions/courses
  – Average HESA tariff
  – Ethnicity
  – POLAR
  – % living at home
  – % privately educated
  – Mean parental occupational class
Institutions
Data
• Measure of parental income
• Individuals borrowing the maximum amount available to wealthier households
• Identifies top fifth of households of those applying to HE
Model
• Y – earnings
• Z – conditioning variables, incl. aggregate HESA variables
• Quantile regression
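The slide's model specification is not preserved in this transcript, so the following is only a generic illustration of what quantile regression optimises. It shows the check (pinball) loss and an intercept-only fit. A full model would minimise the same loss over coefficients on Z. The function names and the earnings figures are made up for the example.

```python
def pinball_loss(residual, tau):
    """Check (pinball) loss for quantile level `tau`, with 0 < tau < 1."""
    return tau * residual if residual >= 0 else (tau - 1) * residual


def best_constant(y, tau):
    """Intercept-only quantile regression.

    Returns the observed value that minimises total pinball loss, which is
    a sample tau-quantile of `y`.
    """
    return min(sorted(y), key=lambda c: sum(pinball_loss(v - c, tau) for v in y))


# Invented annual earnings, including a zero and one very high earner.
earnings = [0, 12000, 18000, 22000, 27000, 30000, 95000]
median_fit = best_constant(earnings, 0.5)
upper_fit = best_constant(earnings, 0.75)
```

Because the loss penalises absolute rather than squared residuals, a single very high earner has little pull on the fitted median. This is one common reason to prefer quantile methods for skewed earnings data.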
Caveats
• Allow for average differences in student intake, not individual ability – motivation for LEO
• Not necessarily causal
• Will include drop outs – can test whether this matters in LEO
• Understate earnings of those moving abroad
What did we find?
• Graduates are much more likely to be in work, and earn considerably more than non-graduates.
• Non-graduates were twice as likely to have no earnings as graduates ten years on (30% against 15% for the cohort commencing their studies in 1999 and observed in 2011/12).
• Average (median) earnings:
  – Male graduates £30,000, non-graduates £22,000
  – Female graduates £27,000, non-graduates £18,000
Britton et al. 2015 http://www.ifs.org.uk/publications/7997
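As a quick worked check, the raw gaps implied by the medians quoted on this slide can be computed directly. These are unadjusted differences in medians, not estimated returns. The variable and function names are illustrative only.

```python
# Median earnings quoted on the slide, in pounds.
medians = {
    "male": {"graduate": 30000, "non_graduate": 22000},
    "female": {"graduate": 27000, "non_graduate": 18000},
}


def premium(grad, non_grad):
    """Proportional gap in median earnings, relative to non-graduates."""
    return grad / non_grad - 1


premia = {
    sex: premium(m["graduate"], m["non_graduate"]) for sex, m in medians.items()
}

# Ratio of the shares with no earnings ten years on (30% against 15%).
no_earnings_ratio = 30 / 15

for sex, gap in premia.items():
    print(f"{sex}: raw median gap {gap:.0%}")
```

On these figures the raw gap is roughly 36% for men and 50% for women.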
What did we find?
• Big differences in earnings according to which university was attended and subject studied
• Not entirely driven by differences in entry requirements
• Creative arts, economics and medicine remain outliers
What did we find?
• Big socio-economic gap in earnings of graduates
  – 30% for males and 24% for females on average
• Taking account of subject and institution
  – Gap remains around 10%
[Figure slides not reproduced. Panel titles: RAW DIFFERENCES; TRIES TO ALLOW FOR STUDENT INTAKE]
LEO DATA – WHAT WE CAN DO METHODOLOGICALLY
What can LEO do?
• Richer prior education achievement controls
  – National Pupil Database - school type, test scores including for independent school pupils
  – HESA – individual level data on entry qualifications, degree achievement etc.
• Crucially have prior education controls for the non-graduate sample so can estimate absolute returns
• Model the impact of postgraduate study
What can LEO do?
• Even more scale
  – more years, more cohorts
  – model life course differentials
• Methodology
  – Inverse Probability Weighted Regression Adjustment
  – Returns are estimated at 3 levels: Subject, Institution and Course
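To give a feel for the weighting idea behind the method named on this slide, here is a deliberately simplified sketch. It implements only the inverse probability weighting step, with a single discrete background stratum. It is not the full Inverse Probability Weighted Regression Adjustment estimator, which also fits an outcome regression. The function name, the strata and the outcome numbers are all invented.

```python
from collections import defaultdict


def ipw_effect(rows):
    """Inverse-probability-weighted difference in mean outcomes.

    `rows` is a list of (stratum, treated, outcome) tuples, with `treated`
    equal to 0 or 1. The propensity score is the share treated within each
    stratum, so every stratum must contain both treated and untreated units.
    """
    n = defaultdict(int)
    n_treated = defaultdict(int)
    for stratum, treated, _ in rows:
        n[stratum] += 1
        n_treated[stratum] += treated

    sums = {0: 0.0, 1: 0.0}     # weighted outcome totals by treatment arm
    weights = {0: 0.0, 1: 0.0}  # total weight by treatment arm
    for stratum, treated, outcome in rows:
        p = n_treated[stratum] / n[stratum]
        w = 1 / p if treated else 1 / (1 - p)
        sums[treated] += w * outcome
        weights[treated] += w
    return sums[1] / weights[1] - sums[0] / weights[0]


# Invented data: outcomes in thousands of pounds, strata by prior attainment.
rows = [("high", 1, 40), ("high", 1, 40), ("high", 1, 40), ("high", 0, 30),
        ("low", 1, 25), ("low", 0, 20), ("low", 0, 20), ("low", 0, 20)]
effect = ipw_effect(rows)

# Unweighted comparison for contrast.
treated = [y for _, t, y in rows if t]
untreated = [y for _, t, y in rows if not t]
naive = sum(treated) / len(treated) - sum(untreated) / len(untreated)
```

In this toy data the treated group is concentrated in the high-attainment stratum, so the unweighted gap overstates the difference. Reweighting balances the strata across the two arms.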
Methodology
• Two models
• University*subject fixed effects (course)
• Time trends for subject, HEI and course
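A course fixed effect in this sense is one effect per institution-by-subject cell. With no other covariates, those effects reduce to cell means, which the sketch below computes as deviations from the grand mean. This is a stripped-down illustration: the actual models include controls and time trends that are not reproduced here. The function name, the labels and the outcome values are hypothetical.

```python
from collections import defaultdict


def course_effects(rows):
    """Course (institution x subject) effects as deviations from the grand mean.

    `rows` is a list of (institution, subject, y) tuples, where `y` is an
    outcome such as log earnings (an assumption for this example).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for institution, subject, y in rows:
        totals[(institution, subject)] += y
        counts[(institution, subject)] += 1
    grand_mean = sum(totals.values()) / sum(counts.values())
    return {
        course: totals[course] / counts[course] - grand_mean for course in totals
    }


# Invented outcomes for two institutions and two subjects.
rows = [("U1", "econ", 10.6), ("U1", "econ", 10.4), ("U1", "arts", 9.9),
        ("U2", "econ", 10.1), ("U2", "arts", 10.0)]
effects = course_effects(rows)
```

Keying on the (institution, subject) pair lets the same subject have a different effect at each institution, which separate institution and subject effects would not allow.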
What can LEO do?
• Control for rich observables
• Still subject to bias from unobservable selection
  – Could merge in UCAS data and compare those making similar choices to address this
HOW CAN WE REALLY USE THESE DATA FOR POLICY?
What do these results really mean?
• Theory matters…
  – Human capital theory
  – Signalling theory
• Implications for public policy
• Implications for students
Implications for public policy
• These data can certainly tell us where public subsidy is going or likely to go
  – Graduates who study subjects such as creative arts
  – If the numbers taking these subjects increase this may bring down the aggregate graduate earnings premium
• Even if our estimates are causal they cannot tell you where it should go
  – Estimating private returns
  – Ignoring externalities, non-monetary returns, social or otherwise
• Inform debate about who we want to subsidise
Implications for public policy
• Employment outcomes already published at institutional level
  – Key information set (KIS)
  – Short term employment rates misleading indicators?
• These data might/will be used for the TEF but some significant issues…
  – Causality – hence getting closer with LEO
  – Data over a longer period to get stability
  – Data relate to the past and may not guide the future
  – Only useful as part of a wider set of measures
  – Unfortunate incentives…
Implications for students
• How might data be used by universities or government?
  – Reduce fee cap for low value subjects?
  – What if students are uninformed? They may choose low cost subjects…
  – What if students are poorly prepared for some subjects? They may still choose low cost subjects…
  – Cross subsidy within institutions so unclear what real costs of provision are and hence fee caps may not operate as intended
Implications for students
• A degree offers a pathway to relatively high earnings for many but not all graduates
  – Do students have a right to know what others have gone on to do from a particular degree?
  – Does data on the past help them?
• Poor students may need additional support to realise the full potential value of a degree?
  – Advice and guidance?
  – Postgraduate study?
Bibliography
• Atkinson, Anthony B., Thomas Piketty and Emmanuel Saez (2011). Top Incomes in the Long Run of History, Journal of Economic Literature, 49(1), pp. 3-71.
• Belley, P. and Lochner, L. (2007) ‘The Changing Role of Family Income and Ability in Determining Educational Achievement’, Journal of Human Capital, 1(1): 37-89.
• Black, D. (2006) ‘Estimating the Returns to College Quality with Multiple Proxies for Quality’, Journal of Labor Economics, 24(3): 701-728.
• Blanden, J. and Machin, S. (2010) Intergenerational Inequality in Early Years Assessments. In Children of the 21st century: The first five years (eds K. Hansen, H. Joshi, and S. Dex), pp. 153-168. Bristol: The Policy Press.
• Britton, Jack, Neil Shephard, and Anna Vignoles. Comparing sample survey measures of English earnings of graduates with administrative data during the Great Recession. No. W15/28. Institute for Fiscal Studies, 2015.
• Boliver, V. (2013) ‘How fair is access to more prestigious UK universities?’ British Journal of Sociology 64 (2): 344-364.
• Chevalier, A. and Conlon, G. (2003), ‘Does it Pay to Attend a Prestigious University?’, Centre for the Economics of Education (CEE) Discussion Paper number 33. Accessed 18th October 2012 from http://cee.lse.ac.uk/ceedps/ceedp22.pdf
• Crawford, C., Macmillan, L. and Vignoles, A. (2014) Progress made by high-attaining children from disadvantaged backgrounds, Social Mobility and Child Poverty Commission Research report, June 2014.
• Cunha, F., Heckman, J. and Lochner, L. (2006) ‘Interpreting the Evidence on Life Cycle Skill Formation’. In Eric Hanushek and Finis Welch (eds) Handbook of the Economics of Education, Amsterdam: North-Holland.
• Chowdry, H., Crawford, C., Dearden, L., Goodman, A., & Vignoles, A. (2013). Widening participation in higher education: Analysis using linked administrative data. Journal of the Royal Statistical Society: Series A (Statistics in Society), 176, 431–457.
• Ermisch, J. and Del Bono, E. (2012) ‘Inequality in Achievements during Adolescence’ in J. Ermisch, M. Jantti and T. Smeeding (eds) Inequality from Childhood to Adulthood: A Cross-National Perspective on the Transmission of Advantage, New York: Russell Sage Foundation.
• Feinstein, L. (2003) Inequality in the Early Cognitive Development of British Children in the 1970 Cohort, Economica, 70, 73-97.
• Goodman A., Sibieta L. and Washbrook E. (2009) Inequalities in Educational Outcomes Among Children Aged 3 to 16, Final report for the National Equality Panel, Institute for Fiscal Studies, London. (Available from http://sta.geo.useconnect.co.uk/pdf/Inequalities%20in%20education%20outcomes%20among%20children.pdf.)
• Jerrim, J., & Vignoles, A. (2013). Social mobility, regression to the mean and the cognitive development of high ability children from disadvantaged homes. Journal of the Royal Statistical Society: Series A (Statistics in Society), 176: 887–906.
• Macmillan, L., Tyler, C. and Vignoles, A. (2014) ‘Who gets the top jobs?’, Institute of Education Research Briefing No.89, http://www.ioe.ac.uk/Research_Expertise/RB89_Who_gets_the_top_jobs_MacmillanTylerVignoles.pdf .
• Walker, Ian, and Yu Zhu. "Differences by degree: Evidence of the net financial rates of return to undergraduate study for England and Wales." Economics of Education Review 30, no. 6 (2011): 1177-1186.
Sample
Subject group
Share of high income households by HESA tariff group