Advantages and challenges of collecting and using longitudinal studies for research and policy

Marta Favara, Senior Research Officer
Paul Dornan, Senior Policy Officer

Young Lives, University of Oxford

Professional Development Conference

6th September, 2016



Outline of this presentation

Value of longitudinal (cohort) studies vs. cross sectional data & RCT

Overview of the Young Lives study, a unique multi-country, multi-cohort longitudinal dataset

Main areas of policy relevant research

Behind the scenes:
- Processes in place for designing and implementing the survey questionnaire
- Multi-cohort longitudinal data: main challenges and risk-mitigation strategies


Value of longitudinal (cohort) studies vs. cross-sectional data & RCT

Longitudinal cohort studies

- Allow a holistic approach
- Enhance understanding of how outcomes are shaped:
  - Allow links to be identified between earlier circumstances and later (long-term) outcomes
  - Identify what shapes later well-being and when differences emerge
- Test the dynamics of social processes:
  - Enable evaluation of the differing impacts of continuing circumstances (or one-off changes) on later well-being, for example the consequences of chronic poverty

RCT

RCTs can give precise answers to specific questions, evaluating the specific changes in well-being attributable to a particular programme, but:
- They can only answer the question posed by the trial
- External validity concerns
- Not able to look at long-term effects (cohort maintenance, costs)

Cross-sectional

- Representativeness
- Easier and cheaper to administer
- Useful for drawing a picture of a specific aspect of society (e.g. DHS)

For example, cross-sectional research can show how many or which households are poor, but cannot show whether households remain poor or move in and out of poverty over time (and therefore what the consequences of chronic poverty are). Pseudo-panels are not a perfect substitute, because they cannot control for time-constant unobservable characteristics.
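As a minimal sketch of the point above, the following Python snippet uses invented data for six households (the identifiers and poverty statuses are hypothetical, not Young Lives data). Both waves show the same poverty rate, so two cross-sections would look identical, yet only the panel linkage reveals who stayed poor and who moved in or out.

```python
from collections import Counter

# Hypothetical poverty status (True = poor) for the same six households in two waves.
wave1 = {"h1": True, "h2": True, "h3": True, "h4": False, "h5": False, "h6": False}
wave2 = {"h1": True, "h2": False, "h3": False, "h4": True, "h5": True, "h6": False}

LABELS = {
    (True, True): "chronically poor",
    (True, False): "exited poverty",
    (False, True): "entered poverty",
    (False, False): "never poor",
}

def poverty_rate(wave):
    """Share of households that are poor in one wave (all a cross-section can show)."""
    return sum(wave.values()) / len(wave)

def transitions(first, second):
    """Count poverty trajectories, which requires linking households across waves."""
    return Counter(LABELS[(first[h], second[h])] for h in first)

print(poverty_rate(wave1), poverty_rate(wave2))  # identical cross-sectional rates
print(transitions(wave1, wave2))
```

In this toy example only one of the three households that were poor in the first wave is chronically poor, which is exactly the information a pair of cross-sections cannot provide.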


Triangulate between methods

These are not competing methodologies: each can be employed to triangulate between methods, and one can be used to inform the other (particularly relevant in developing countries).

- Use multi-purpose observational cohort studies, for example, to identify areas worth examining in greater detail with experimental techniques or qualitative research
- Gain invaluable insights into how the risks and opportunities children encounter along the way can affect their long-term outcomes
- Understand the problem in order to design effective policies: what, when and how to intervene


What is Young Lives?


Young Lives: a unique multi-country, multi-cohort longitudinal study


Sampling design

- Sentinel site sampling
- Four-stage sampling process (region, district/province, sentinel sites, random sampling of children within sites)
- Poor areas purposively over-sampled (40% urban / 60% rural), using different poverty indicators in each country

Study countries: Ethiopia, India, Peru, Vietnam

It was decided that a range of children should be sampled, not only the poorest children, although poor families were over-sampled. Within each cluster, children were randomly selected. In each country, 2,000 children aged between 6 and 18 months were selected to be followed as they grew up over 15 years. This was considered an appropriate number given the duration and scope of the study. It was also considered to be sufficiently large for statistical analysis in general, allowing for the detection of moderate-sized differences between sub-groups of children. A similar sample of 1,000 children per country aged between 7.5 and 8.5 years was selected as an Older Cohort for comparison.
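The final stage described in these notes (random selection of children within each cluster) could be sketched as follows. This is an illustration only: the site names, the listing of 300 eligible children per site and the fixed seed are hypothetical, and the figures of 20 sites and 100 younger-cohort children per site are taken from the country annex slides for Ethiopia, India and Vietnam rather than from a documented procedure.

```python
import random

SITES = 20                # sentinel sites per country (from the annex slides)
YOUNGER_PER_SITE = 100    # younger-cohort children selected per site

def sample_site(eligible_ids, n, rng):
    """Simple random sample (without replacement) of n children from one site."""
    return rng.sample(eligible_ids, n)

def sample_country(eligible_by_site, n_per_site, seed=0):
    """Apply the within-site draw to every sentinel site."""
    rng = random.Random(seed)
    return {site: sample_site(ids, n_per_site, rng)
            for site, ids in eligible_by_site.items()}

# Hypothetical household listing: 300 eligible children in each site.
eligible = {
    f"site{s:02d}": [f"site{s:02d}-child{c:03d}" for c in range(300)]
    for s in range(SITES)
}

younger = sample_country(eligible, YOUNGER_PER_SITE)
print(sum(len(children) for children in younger.values()))  # 20 x 100 = 2000
```

The same draw with 50 children per site would give the 1,000-child Older Cohort.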

Nice features of YL data

- Longitudinal data covering a period of 15 years, from early childhood to adulthood
- A life-course approach, very relevant for policy design (early childhood, middle childhood and adolescence)
- Cross-cohort, cross-country comparisons:
  - Compare two cohorts at the same age (trends, exposure to different policy contexts)
  - A new generation (children of the YL children)
- Use (panel) sibling data to investigate how household or community circumstances affect child outcomes at the same age; explore intra-household dynamics; control for the influence of past events and circumstances
- Comprehensive set of information collected at community and household level (caregiver, YL child, a subsample of (younger) siblings and the children of the YL children)

For major events that occur across the whole country or most of the study communities (macroeconomic events, droughts, food price crises, introduction of new social programmes), the sibling comparison is especially useful, as the whole index cohort will experience the same event (i.e. we have no control group).

Main areas for policy relevant research

- Nutrition & Health
- Education: School Effectiveness; Learning Trajectories and Skills Formation over the Life-cycle
- Pathways to and from Marriage and Parenthood
- Transition to the Labour Market
- Poverty & Inequality


Main areas for policy relevant research (1)

Nutrition & Health

- What are the long-run effects of early childhood malnutrition? What are the impacts on the development of cognitive skills and psycho-social competencies?
- What are the incidence, extent and determinants of growth recovery and failure in adolescence?
- What are the nature and determinants of maternal malnutrition during the life-cycle, and the implications for maternal and child outcomes?
- What predicts risky behaviours (smoking, drinking, drugs, criminal behaviours)?

- Stunting and its implications for child development are generally considered irreversible beyond the first 1,000 days since conception. However, Young Lives evidence shows that recovery from stunting after the first 1,000 days is possible: 45-67% of children who were stunted at age 1 had recovered from stunting by age 8, with a significant share of this recovery occurring after age 5.
- Catch-up growth between ages 1 and 8 is positively associated with performance in mathematics, reading comprehension and vocabulary tests.
- Higher self-esteem and aspirations at age 15 reduce the probability of engaging in risky behaviour by age 19.
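To make the recovery figure concrete, here is a small sketch of how a "share recovered" statistic can be computed from two rounds of height-for-age z-scores (HAZ). It assumes the common WHO convention that a child is stunted when HAZ is below -2; the notes do not state the cutoff used, and the five children and their scores are invented.

```python
STUNTING_CUTOFF = -2.0  # assumed WHO convention: stunted if HAZ < -2

def is_stunted(haz):
    return haz < STUNTING_CUTOFF

def recovery_share(haz_early, haz_later):
    """Among children stunted in the early round, the share no longer stunted later."""
    stunted_early = [cid for cid, z in haz_early.items() if is_stunted(z)]
    if not stunted_early:
        return None  # undefined when nobody was stunted at baseline
    recovered = [cid for cid in stunted_early if not is_stunted(haz_later[cid])]
    return len(recovered) / len(stunted_early)

# Hypothetical z-scores for the same children at age 1 and age 8.
haz_age1 = {"a": -2.5, "b": -3.1, "c": -1.0, "d": -2.2, "e": -0.4}
haz_age8 = {"a": -1.5, "b": -2.6, "c": -0.8, "d": -1.9, "e": -2.3}

print(recovery_share(haz_age1, haz_age8))  # 2 of the 3 early-stunted children recovered
```

Note that child "e" becomes stunted only later; tracking such cases alongside recovery is only possible because the same children are measured in every round.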


Main areas for policy relevant research (2)

Education: School Effectiveness

- What are the characteristics of effective schools (including teacher and management characteristics, public vs. private schools, language of instruction, etc.)?
- What lessons may be learned across contexts concerning school effectiveness and educational policies?

Education: Learning Trajectories and Skills Formation over the Life-cycle

- To what extent is schooling important in shaping children's learning and in the formation of cognitive, non-cognitive and technical skills? At which stages of the educational life-course is schooling more or less critical?
- At what stages do learning gaps emerge, widen or narrow?

- Aspirations at age 12 matter for educational achievement at age 19 in Ethiopia.
- Singh, in India: "I find a substantial positive effect of private schools on English, no effect on Mathematics for 8 to 10-year-old students; at 15 years, there are modest effects (<0.2 SD) on Mathematics and Telugu receptive vocabulary."
- There are substantial learning gaps across countries on standardised international assessments. A clear pattern of stochastic dominance is evident at the age of 5 years, prior to school enrolment, with children in Vietnam at the upper end, children in Ethiopia at the lower, and Peru and India in between. Differences between country samples grow in magnitude at later ages, preserving the country rankings noted at 5 years of age over the entire age range studied. This divergence is only partly explained by home investments and child-specific endowments.
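The "stochastic dominance" mentioned in these notes can be illustrated with a simple first-order check on empirical distributions: sample A dominates sample B if, at every score, the share of A at or below that score is no larger than the share of B. The sketch below uses invented test scores and is not the procedure used in the underlying paper, which the notes do not describe.

```python
def ecdf(sample, x):
    """Empirical CDF: share of observations less than or equal to x."""
    return sum(v <= x for v in sample) / len(sample)

def first_order_dominates(a, b):
    """True if a first-order stochastically dominates b on the observed scores."""
    grid = sorted(set(a) | set(b))
    gaps = [ecdf(b, x) - ecdf(a, x) for x in grid]
    return all(g >= 0 for g in gaps) and any(g > 0 for g in gaps)

# Hypothetical test scores for two country samples.
higher = [52, 60, 68, 75]
lower = [40, 50, 58, 66]

print(first_order_dominates(higher, lower))  # True
print(first_order_dominates(lower, higher))  # False
```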


Main areas for policy relevant research (3)

Pathways to and from Marriage and Parenthood

- What are the (early) social and economic predictors of getting married, cohabiting or having a child during the teen years?
- What is the role of parental and childhood expectations and aspirations, as well as gender norms and preferences?
- How do changes in the labour market affect young people's relationships and decisions around marriage and parenthood?
- What are the social and economic consequences of getting married, cohabiting or having a child during the teen years (e.g. women's economic participation)?
- What affects the quality of married life and decision-making for married/cohabiting adolescent girls, boys and couples?

Our analysis of the predictors of early marriage identifies groups of women particularly vulnerable to teenage marriage; in particular, we are able to quantify the extent to which the probability of teenage marriage co-varies with social and economic disadvantage as exhibited by (for example) household wealth, parental education, and rurality. This is particularly relevant for the geographical and social targeting of programmes.

PERU: About 1 in 5 females (and 1 in 20 males) in our sample have at least one child by the age of 19, and 80 percent of them are married or cohabiting. Early marriage/cohabitation is intrinsically related to early pregnancy and largely predicted by the same factors. Focusing on females, girls living in poor households and without one of their parents for a prolonged period are at higher risk of early childbearing. Similarly, girls whose self-efficacy and educational aspirations decrease over time are more at risk of becoming a mother during adolescence. Conversely, school attendance and better school performance predict a lower risk of early pregnancy, and our analysis suggests this is largely by postponing the first sexual relationship.

INDIA: We document that married young women have significantly poorer outcomes at 19 across a range of measures (subjective wellbeing, psychosocial outcomes and access to education) than their unmarried peers. This disadvantage is not fully removed even upon conditioning on various fixed or pre-determined characteristics and investments, or on lagged values of the outcomes.

Main areas for policy relevant research (4)

Transition to the Labour Market

- What happens to young women and men when they leave education and enter the labour market at the ages of 15 and 22? How many of them are employed (and self-employed), unemployed, inactive and under-employed?
- How do their background and experiences as children shape their access to the labour market?
- What skills facilitate the transition to the labour market and to quality jobs? To what extent are education and training effectively equipping youth with the right skills for the labour market?
- To what extent do young people realise their childhood aspirations? What role do expectations play?
- How is the school-to-work transition of young people related to other parallel key early life transitions, including cohabitation, marriage and childbearing? How do young people reconcile paid activities with other responsibilities?


Main areas for policy relevant research (5)

Poverty & Inequality

- Exploring the links between childhood poverty, the strategies people use to earn their living and the assets available to them, and the implications for children's long-term life chances.
- How do inequalities interact in the ways they affect children's development potential?
- How do inequalities, including gender inequalities, evolve during early, middle and later childhood?
- The impact of transfers and social protection.


Multi-cohort longitudinal data: main challenges

- Challenge 1. Cohort maintenance
- Challenge 2. Getting comparable measures over time
- Challenge 3. Cross-country coordination/comparability
- Challenge 4. Ensuring high-quality data
- Challenge 5. Data collection methods: switch to CAPI


Challenges: 1. Cohort maintenance & attrition

Challenges:
- Some attrition is inevitable
- Cohort is relatively small for a longitudinal study
- Study period is relatively long (three-year gap between waves)

Risk-mitigating strategies:
- Collecting detailed contact information
- Importance of tracking
- Maintaining continuity of social contact and trust between researchers and families
- Reducing refusal rates as much as possible:
  - Importance of explaining what we're doing
  - Reciprocity
  - Ensuring no respondents are over-loaded (by different elements/sub-studies)
  - Compensation (losing a day of work has a big impact on income)


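... and we have been quite successful!

Attrition rates (YC = Younger Cohort, OC = Older Cohort):

Country    YC      OC      Overall
Ethiopia   2.2%    8.4%    4.3%
India      2.6%    4.3%    3.2%
Peru       6.3%    10.3%   7.3%
Vietnam    2.9%    9.9%    5.3%
Total      3.6%    8.1%    5.0%

[Figure: panels for Ethiopia, India, Peru and Vietnam]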
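The Overall figures above appear to be sample-size-weighted averages of the YC and OC rates. The sketch below checks this under the assumption that each country has the nominal design sizes of 2,000 younger and 1,000 older children; the realised cohort sizes are not given on the slide, so this is an approximation rather than a reconstruction of the original calculation.

```python
NOMINAL_SIZES = (2000, 1000)  # assumed YC and OC design sizes; realised sizes may differ

def pooled_rate(rates_pct, sizes=NOMINAL_SIZES):
    """Sample-size-weighted average of per-cohort rates, in percent."""
    return sum(r * n for r, n in zip(rates_pct, sizes)) / sum(sizes)

print(round(pooled_rate((2.2, 8.4)), 1))   # Ethiopia: 4.3
print(round(pooled_rate((2.6, 4.3)), 1))   # India: 3.2
print(round(pooled_rate((6.3, 10.3)), 1))  # Peru: 7.6, not the 7.3 reported
```

Ethiopia and India are reproduced under the nominal sizes, while Peru (and, by a smaller margin, Vietnam) is not, which suggests that the realised cohort sizes in those countries differ from the nominal design.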


Challenges: 2. Getting comparable measures over time

Challenges:
- The questions need to change as the children grow up
- Keep as many questions as possible the same across rounds (panel variables)
- Ask the YC the same questions we asked the OC in earlier rounds (core base variables)
- Ensure comparability over time (e.g. cognitive tests: Item Response Theory)

Limitations for comparability:
- Switch from PAPI to CAPI
- Some changes in the structure of the questionnaire are inevitable
- Getting stuck with the errors of the past for the sake of maintaining comparability across rounds
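The reason Item Response Theory helps with comparability can be shown with a minimal sketch. The slide does not say which IRT model is used, so the one-parameter (Rasch) model below is chosen purely for illustration, and the item difficulties are hypothetical. The idea is that a child's latent ability is separated from the difficulty of the items, so raw scores on tests of different difficulty can be related to a single ability scale.

```python
import math

def rasch_prob(theta, difficulty):
    """Probability of a correct answer under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

def expected_score(theta, difficulties):
    """Expected number of correct answers on a test for a given ability."""
    return sum(rasch_prob(theta, b) for b in difficulties)

# Hypothetical item difficulties for an easier and a harder version of a test.
easy_test = [-1.5, -1.0, -0.5]
hard_test = [0.5, 1.0, 1.5]

theta = 0.0  # the same underlying ability
print(expected_score(theta, easy_test) > expected_score(theta, hard_test))  # True
```

The same ability produces different raw scores on the two tests, which is why raw scores from age-adapted instruments cannot be compared directly across rounds.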


Challenges: 3. Cross-country coordination and comparability

Benefits:
- Seeing how patterns of relationships are similar/different across countries
- Understanding why and how specific policies or programmes are effective in one country
- Comparative analysis can give greater confidence that evidence found in one country is applicable to others
- Learning in relation to methods: trying to develop measures that can be used across cultures

Challenges:
- Constructing a questionnaire that suits different national contexts
- Ethical committee approval and country-specific sensitivities
- Dealing with different fieldwork processes

Risk-mitigating strategies:
- Define research priorities and relevant survey questions in each country
- There are also some country variations
- Translation and back-translation are key to ensuring consistency
- Continuity of country team leaders and fieldworker coordinators


Challenges: 4. Quality of the data

Challenges:
- Maintaining increasingly complex survey instruments
- Maintaining strong coordination and liaison between Quant/Qual/School survey teams
- Participant recall

Risk-mitigating strategies:
- Piloting and training are crucial!
- Ensure research questions work in the field and are consistent with local situations and children's ages
- Ensure questionnaires are not too long/burdensome
- Train teams and learn from practical experience of fieldwork
- Produce accurate instrument manuals and protocols
- Ensure that good data collection systems are in place

Consistency checks are embedded in CAPI and some information is prefilled; ultimately, some inconsistencies can be resolved ex post.


Challenges: 5. Introducing CAPI

CAPI, introduced in R4, is a different way of doing surveys (e.g. it changes the dynamic of the interview)

Benefits:
- Eliminates data entry errors
- Know how work is progressing
- Avoid mistakes before they happen (embedded skip patterns)

Challenges:
- Requires more time at the front end (building the programme)
- Fieldworkers need to become familiar with a new instrument
- Putting in place data management and transfer systems
- Devolving responsibilities to the in-country data managers (in Peru and Vietnam)

Risk-mitigating strategies:
- Extra effort at the front end in programming
- Piloting and testing the application is crucial!
- Training country data managers and fieldworkers on data management and transfer systems
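What an embedded skip pattern and a consistency check against prefilled information might look like can be sketched in a few lines. The question names ("enrolled", "grade", "age") and the rules are hypothetical and are not taken from the Young Lives CAPI programme; the point is only that the logic runs during the interview rather than at data entry.

```python
def next_question(answers):
    """Embedded skip pattern: the grade question is only asked of enrolled children."""
    if "enrolled" not in answers:
        return "enrolled"
    if answers["enrolled"] and "grade" not in answers:
        return "grade"
    return None  # nothing left to ask in this toy module

def consistency_errors(current, prefilled):
    """Compare current answers with values prefilled from the previous round."""
    errors = []
    if current["age"] <= prefilled["age"]:
        errors.append("age has not increased since the previous round")
    if not current["enrolled"] and current.get("grade") is not None:
        errors.append("grade recorded for a child who is not enrolled")
    return errors

print(next_question({"enrolled": False}))  # None: the grade question is skipped
print(consistency_errors({"age": 11, "enrolled": True, "grade": 5}, {"age": 12}))
```

On paper, both kinds of error would only surface during data cleaning; in a CAPI application the fieldworker can resolve them while still with the respondent.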


FINDING OUT MORE: www.younglives.org.uk

THANK YOU!

These are not competing methodologies: each can be employed to triangulate between methods, and one can be used to inform the other. Longitudinal studies give invaluable insights into how the risks and opportunities children encounter along the way can affect their long-term outcomes, and help in understanding the problem in order to design effective policies: what, when and how to intervene.

Annex


Young Lives in a nutshell

Multi-disciplinary study that aims to:
- improve understanding of childhood poverty and inequalities
- provide evidence to improve policies & practice

Young Lives components:
- Household survey (child, caregiver, younger siblings, children of the YL children, community representatives)
- Longitudinal qualitative research
- School survey: parallel to rounds 2 and 5 of the household survey

Following nearly 12,000 children in 4 countries: Ethiopia, India (Andhra Pradesh & Telangana), Peru and Vietnam

Over a 15-year period: first data collected in 2002, with 5 survey rounds

Two age cohorts in each country:
- 2,000 children born in 2000-01
- 1,000 children born in 1994-95

Collaboration: partners in each study country

Publicly archived survey data (UK Data Archive, and listed on the World Bank Micro Data website); core-funded by DFID


Six steps from design to the field

Step 1: Designing the survey questionnaire

Step 2: Tracking and preparing the CAPI programme

Step 3: Training and piloting

Step 4: Fieldwork

Step 5: Data cleaning and validation

Step 6: Preliminary analysis and research


Information collected

- Demographic information (hh roster), socio-economic indicators (wealth index, food consumption)
- Health information and anthropometrics (YL child, parents, siblings and children of the YL children)
- Education history (all hh members) and cognitive skills (YL child, siblings)
- Subjective wellbeing and psychosocial competencies (YL child, siblings)
- Employment status/history and time use (all hh members)
- Job-related skills
- Job and educational aspirations/expectations (YL child, parents)
- Expectations about marriage and parenthood (YL child, parents)
- Fertility history
- Marriage/cohabitation history
- Control over assets (intra-household decision making)
- Social norms indicators
- Knowledge of SRH and access to contraceptives
- Sexual behaviours, risky behaviours and criminal activities (Peru)


Non-random attrition

Source: Outes and Dercon, 2008

- Attriting households (R1-R2) tend to have fewer assets and poorer access to services and utilities, and are less educated (more so in Ethiopia and India than in Peru and Vietnam) (Panel A)
- These averages hide substantial variation between different types of attriting households (Panel B)
- The presence of non-random attrition does not necessarily imply attrition bias: no attrition bias found when looking (Ethiopia is an exception)
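A first check for non-random attrition is simply to compare Round 1 characteristics of households that later left the study with those that stayed. The sketch below does this for one variable using invented records; the field names (`wealth_index`, `attrited`) are hypothetical, and this is not the method of Outes and Dercon, which the slide does not describe.

```python
from statistics import mean

def baseline_gap(records, variable):
    """Mean of a Round 1 variable among attriters minus the mean among stayers."""
    attrited = [r[variable] for r in records if r["attrited"]]
    stayed = [r[variable] for r in records if not r["attrited"]]
    return mean(attrited) - mean(stayed)

# Hypothetical Round 1 records with an indicator for loss to follow-up by Round 2.
round1 = [
    {"wealth_index": 0.2, "attrited": True},
    {"wealth_index": 0.3, "attrited": True},
    {"wealth_index": 0.4, "attrited": False},
    {"wealth_index": 0.5, "attrited": False},
    {"wealth_index": 0.6, "attrited": False},
]

print(baseline_gap(round1, "wealth_index"))  # negative: attriters were poorer at baseline
```

A non-zero gap indicates non-random attrition. Whether it also biases estimates is a separate question, as the last bullet above notes.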


Ethiopia

Sampling design (1)

Four-stage sampling process:
- Regions (Amhara, Oromia, SNNPR, Tigray and Addis Ababa, accounting for 96% of the national population)
- Woredas (districts) (3-5 districts in each region, 20 in total)
- Kebeles (at least 1 for each woreda)
- 100 young children (born in 2001-02) and 50 older children (born in 1994-95) were selected within those sites

Criteria to select districts:
- Districts with a food deficit profile
- Districts which capture diversity across regions and ethnicities in both urban and rural areas
- Manageable costs in terms of tracking for future rounds

Comparison with DHS and WMS 2000: poor households are over-sampled, but YL covers the diversity of children in the country, including up to the 75th percentile of the Ethiopian population.

If a selected family had both 1-year-old and 8-year-old children, the younger child was included (since a greater number needed to be enrolled).

WMS: Welfare Monitoring Survey

India

Sampling design (2)

Four-stage sampling process:
- AP, Telangana
- Districts
- 20 sentinel sites (defined by mandal)
- 100 young children (born in 2001-02) and 50 older children (born in 1994-95) were randomly selected within those sites

Criteria to (rank and) select districts and mandals:
- Economic development (per capita income, % of urban population)
- Human development (female literacy, infant mortality, etc.)
- Infrastructure (total length of road per 100 km², number of hospital beds per 10,000 people)
- One poor and one non-poor district/mandal in each region/district (districts selected for sampling covered approximately 28% of the state population)

Comparison to the DHS 1998/9: YL households seem to be slightly wealthier than the average household in Andhra Pradesh. The YL sample covers the diversity of children in poor households in Andhra Pradesh.

- The old Andhra Pradesh (AP & Telangana) includes 23 administrative districts for a total of 1,125 mandals.
- In total, the districts selected for sampling covered approximately 28% of the state population and include around 318 of the 1,119 mandals (excluding Hyderabad).

Mandals: The third step was to select mandals to be sentinel sites. Since there are relatively few urban mandals, the district capital was invariably chosen in urban areas, and one site was chosen from the urban slums of Hyderabad. The remaining sentinel sites were selected by ranking mandals within the six selected districts, again using development indicators.

Villages: Each mandal/sentinel site was divided into four contiguous geographical areas and one village was randomly selected from each area. Where sufficient children were not identified from the selected sample villages, additional villages were included. In urban areas, municipal wards were defined as communities and identified using the Census codes. In Hyderabad, three slum areas in different parts of the city were selected.

A comparison to the DHS 1998/9 (the year closest to Round 1 of Young Lives in 2002) indicates that the Young Lives sample includes households with better access to services and more ownership of assets, and thus includes some biases. A comparison on the wealth index scores reveals that the Young Lives households seem to be slightly wealthier than the average household in Andhra Pradesh. These differences could be accounted for in part by the earlier data collection year of the DHS. Despite these biases, it is shown that the Young Lives sample covers the diversity of children in poor households in Andhra Pradesh.

Peru

Sampling design (3)

Sampling process:
- Sample frame at district level, excluding the richest 5% of districts based on the 2001 poverty map
- Districts divided into population groups ordered by poverty index, and randomly selected to cover rural, urban, peri-urban, coastal, mountain and Amazon areas (random selection proportional to district population)
- Within the selected districts, a village was randomly chosen
- Within each village, the street blocks were counted and randomly numbered to select the starting point

Comparison to the DHS 2000: YL covers the diversity of children and households in Peru

The project team visited a total of 36,373 dwellings to recruit 2,751 children. Although this may seem high, we estimated (using census data) that we would need to visit 13 families to recruit one child of the right age.

A comparison of Round 1 to the Demographic and Health Survey 2000 (DHS 2000) shows that the Young Lives sample covers the diversity of children and families in Peru (Escobal and Flores 2008). At the same time, their analysis indicates that, on average, the Young Lives sample includes households with more education, with better access to services and more ownership of assets than in the DHS. However, this does not take into account the fact that in the Young Lives sample for Peru, each district had a probability of being selected proportional to its population size. Once each observation is adjusted to account for this, many of the differences found between the Young Lives and the DHS 2000 samples are not significant. For this reason, we report results for the Young Lives sample in Peru using the sampling frame, as these are the results that most closely resemble what is happening in the country.
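Selection with probability proportional to population size, as described for the Peruvian districts, can be sketched as below. The district names and populations are invented, the draw is done with replacement for simplicity, and the snippet does not attempt to reproduce the adjustment applied in the comparison with the DHS, which the notes do not spell out.

```python
import random

# Hypothetical districts and populations (not taken from the Peruvian poverty map).
districts = {"A": 50_000, "B": 30_000, "C": 15_000, "D": 5_000}

def selection_probabilities(populations):
    """Probability of each district being picked in a single PPS draw."""
    total = sum(populations.values())
    return {name: pop / total for name, pop in populations.items()}

def draw_pps(populations, k, seed=0):
    """Draw k districts with replacement, with probability proportional to population."""
    rng = random.Random(seed)
    names = list(populations)
    weights = [populations[n] for n in names]
    return rng.choices(names, weights=weights, k=k)

probs = selection_probabilities(districts)
print(probs["A"])  # 0.5: half of the total population lives in district A
print(draw_pps(districts, k=3))
```

Because larger districts are more likely to be drawn, unadjusted sample averages need not match population averages, which is presumably the reason the notes adjust each observation before comparing with the DHS 2000.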


Vietnam

Sampling design (4)

Four-stage sampling process:
- Regions (5 of 8 regions: North-East region, Red River Delta, City, South Central Coast, Mekong Delta)
- Provinces (5 in total, 1 per region: Lao Cai, Hung Yen, Da Nang, Phu Yen, Ben Tre)
- Sentinel sites (4 communes per province: 2 poor, 1 average and 1 above-average)
- 100 young children (born in 2001-02) and 50 older children (born in 1994-95) were selected within those sites

Criteria to rank communes:
- Development of infrastructure
- Percentage of poor households in the commune
- Child malnutrition status

Comparison to the DHS and VHLSS 2002: the urban sector is under-represented. YL includes households with, on average, less access to basic services, and slightly poorer than the average in Vietnam. The YL sample covers the diversity of children in the country.

Among the 31 communes initially selected, 15 were from the poor group (48%), nine from the average group (29%), and seven (23%) from the above-average group.

Other criteria used in the selection were: the commune should represent common provincial features; commitment from local government for the research; feasibility of research logistics; population size.

VHLSS: Vietnam Household Living Standards Survey