INDIA HUMAN DEVELOPMENT SURVEY (IHDS) TRAINING PROGRAM MARCH 16, 2016 How to merge two rounds?

Upload: shantanu-mishra

Post on 14-Apr-2017

Page 2: Merging

Merging Household Files

Page 3: Merging

Relationship between IHDS-I and IHDS-II households

IHDS-I sample (N=41,554)

Replacement households in IHDS-II (N=2,134)

Split households from round 1 (N=5,397)

Reinterview households (N=34,621)

Attrition (N=6,911)

Most important concepts in merging two data files

1. Some households in round 1 have no match in round 2, and vice versa

2. Some households in round 1 match more than one household in round 2 (split households)
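Both points can be seen in a small toy merge. The sketch below uses invented household IDs and hypothetical variable names (hhid2005, hhsplit, r1income), not IHDS data. Round 1 household 1 splits into two round 2 households, household 3 is lost to attrition, and one round 2 household has no round 1 ID (a replacement).

```stata
* Toy round 2 file: hhid2005 is the round 1 ID, missing (.) for a new household
clear
input hhid2005 hhsplit
1 1
1 2
2 1
. 1
end
tempfile r2
save `r2'

* Toy round 1 file: one row per household
clear
input hhid2005 r1income
1 100
2 200
3 300
end

* One round 1 household may match several round 2 households, hence 1:m
merge 1:m hhid2005 using `r2', gen(_m)

* _m==1: round 1 only, _m==2: round 2 only, _m==3: in both rounds
tabulate _m
```

With these toy rows the merged file has five observations: three matched, one found only in round 1 and one found only in round 2.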

Page 4: Merging

Any questions?

Who were chosen for reinterview?

Recontact rate of 83%? What does it mean?

How were replacement households chosen?

What is a split household?

Page 5: Merging

What is needed to merge household files?

1. Round 1 household file – N=41,554

2. Round 2 household file – N=42,152 (Why are there more cases in round 2?)

3. Linking file – N=42,152 – gives Round 1 identification codes for all Round 2 households that were reinterviewed; linking codes are missing for the 2,134 households that are new

Page 6: Merging

Step 1 – Link round 2 data to linking file to get round 1 ID

use linkhh, clear
sort STATEID DISTID PSUID HHID HHSPLITID
merge 1:1 STATEID DISTID PSUID HHID HHSPLITID using round2HH, gen(_mergeR2link)
sort STATEID DISTID PSUID HHID2005 HHSPLITID2005
save round2HH_plus, replace
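A 1:1 merge stops with an error if the key variables do not uniquely identify the rows of either file, so it can help to check this before Step 1. The following is a minimal sketch on invented rows, not on the IHDS files; isid is a standard Stata command that exits with an error when the listed variables are not a unique key.

```stata
* Toy file with one split household (HHID 10 appears with HHSPLITID 1 and 2)
clear
input STATEID DISTID PSUID HHID HHSPLITID
1 1 1 10 1
1 1 1 10 2
1 1 1 11 1
end

* Passes silently: the five ID variables together identify each row
isid STATEID DISTID PSUID HHID HHSPLITID

* Fails without HHSPLITID, because HHID repeats for split households;
* capture stores the nonzero return code in _rc instead of stopping
capture isid STATEID DISTID PSUID HHID
display "return code without HHSPLITID: " _rc
```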

Page 7: Merging

Step 2 – Merge this Round 2+ file with Round 1 file

use round1HH, clear
rename HHID HHID2005
rename HHSPLITID HHSPLITID2005
sort STATEID DISTID PSUID HHID2005 HHSPLITID2005
merge 1:m STATEID DISTID PSUID HHID2005 HHSPLITID2005 using round2HH_plus, gen(_mergeR1R2)
sort STATEID DISTID PSUID HHID HHSPLITID
save mergedHHR1R2, replace
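Because this is a 1:m merge, a round 1 household that split appears on several rows of the merged file. One possible way to see how many round 2 households each round 1 household maps to is sketched below on invented rows. The variables n_r2 and tag are hypothetical helpers, not IHDS variables.

```stata
* Toy matched rows: round 1 household 101 has two round 2 households,
* 102 has one, and 103 has three
clear
input HHID2005 HHSPLITID
101 1
101 2
102 1
103 1
103 2
103 3
end

* Number of round 2 households per round 1 household
bysort HHID2005: gen n_r2 = _N

* Flag one row per round 1 household so each is counted once
bysort HHID2005: gen tag = (_n == 1)

* Round 1 households that map to more than one round 2 household
count if tag & n_r2 > 1
```

Here the count is 2 (households 101 and 103).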

Page 8: Merging

Cases in the merged file are a superset

Households surveyed in both rounds: N=40,018

Households surveyed in round 1 only (attrition): N=6,911

Households surveyed in round 2 only (replacement): N=2,134

Total: N=49,063

Keep only _mergeR1R2==3 for panel analysis (N=40,018)
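The household counts on the slides can be cross-checked with simple arithmetic. The sketch below only restates numbers already given above; each assert passes silently if the identity holds.

```stata
* Merged file = both rounds + round 1 only + round 2 only
assert 40018 + 6911 + 2134 == 49063

* Matched rows = reinterviewed households + split households
assert 34621 + 5397 == 40018

* Round 2 file = matched rows + replacement households
assert 40018 + 2134 == 42152
```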

Page 9: Merging

Merging Individual Files

Page 10: Merging

Relationship between IHDS-I and IHDS-II individuals

IHDS-I sample (N=215,754)

New individuals, new HH (N=9,760)

New individuals in R1 HH (N=43,822)

Reinterview individuals (N=150,995)

HH attrition (N=29,299)

Individual attrition in interview HH (N=35,464)

Most important concept in merging two data files

1. Even reinterview households have new members (births, marriages)

2. Even reinterview households have some members who are no longer there (deaths, marriages, migration)

Page 11: Merging

What is needed to merge individual files?

1. Round 1 individual file – N=215,754

2. Round 2 individual file – N=204,568 (Why are there fewer cases in round 2?)

3. Linking file – N=204,568 – gives Round 1 identification codes for all Round 2 individuals who were reinterviewed; linking codes are missing for new individuals, including those in the 2,134 households that are new

Page 12: Merging

Step 1 – Link round 2 data to linking file to get round 1 ID

use linkind, clear
sort STATEID DISTID PSUID HHID HHSPLITID PERSONID
merge 1:1 STATEID DISTID PSUID HHID HHSPLITID PERSONID using round2IND, gen(_mergeR2link)
sort STATEID DISTID PSUID HHID2005 HHSPLITID2005 PERSONID2005
save round2IND_plus, replace

Page 13: Merging

Step 2 – Merge this Round 2+ file with Round 1 file

use round1IND, clear
rename HHID HHID2005
rename HHSPLITID HHSPLITID2005
rename PERSONID PERSONID2005
sort STATEID DISTID PSUID HHID2005 HHSPLITID2005 PERSONID2005
merge 1:m STATEID DISTID PSUID HHID2005 HHSPLITID2005 PERSONID2005 using round2IND_plus, gen(_mergeR1R2)
sort STATEID DISTID PSUID HHID HHSPLITID
save mergedINDR1R2, replace

Page 14: Merging

Cases in the merged file are a superset

Individuals surveyed in both rounds: N=150,988

Individuals surveyed in round 1 only (attrition/death/migration): N=64,766

Individuals surveyed in round 2 only (replacement/new): N=53,580

Total: N=269,334

Keep only _mergeR1R2==3 for panel analysis (N=150,988)
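The same kind of arithmetic check applies to the individual counts. The sketch uses only the figures on this slide and the file sizes listed earlier.

```stata
* Merged file = both rounds + round 1 only + round 2 only
assert 150988 + 64766 + 53580 == 269334

* Round 1 individual file = both rounds + round 1 only
assert 150988 + 64766 == 215754

* Round 2 individual file = both rounds + round 2 only
assert 150988 + 53580 == 204568
```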

Page 15: Merging

Ever-married woman file linkage

Page 16: Merging

Same process as individual file linkage

But one thing to note: there was no ever-married woman file for 2004-5, so you will be merging with the household file from 2004-5
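Merging woman-level rows onto a household file is a many-to-one merge whenever more than one woman can belong to the same round 1 household. That is an assumption of this sketch, as are the toy variable names womanid and r1_var and the invented rows. It is not the actual IHDS file layout.

```stata
* Toy 2004-5 household file: one row per household
clear
input HHID2005 r1_var
1 10
2 20
end
tempfile hh05
save `hh05'

* Toy round 2 woman file carrying the round 1 household ID
clear
input HHID2005 womanid
1 1
1 2
2 1
3 1
end

* Several women may share a household, so the woman file is the "many" side
merge m:1 HHID2005 using `hh05', gen(_m)
tabulate _m
```

In this toy data three women match a 2004-5 household and one (household 3) does not.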

Page 17: Merging

Merging Caution

Page 18: Merging

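Merging does not keep two copies of a variable that has the same name in both files: by default Stata keeps the values from the file in memory (the master) and discards those from the using file

So if you want to keep variables from round 1 and round 2 separate, you may want to rename all round 1 variables before merging

Typically we use the commands

rename * x*
rename xSTATEID STATEID

etc. for the merging variables

So xr05 will be age in 2004-5 and r05 will be age in 2011-12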
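The renaming pattern can be tried on a one-row toy file. This is a sketch under assumptions: the row is invented, and the grouped rename with parentheses is just one convenient way to restore the merge keys.

```stata
* Toy round 1 row; r05 stands in for the age variable named on the slide
clear
input STATEID DISTID PSUID HHID HHSPLITID r05
1 2 3 10 1 34
end

* Prefix every variable with x
rename * x*

* Restore the merge keys, giving the household IDs their 2005 names
rename (xSTATEID xDISTID xPSUID) (STATEID DISTID PSUID)
rename xHHID HHID2005
rename xHHSPLITID HHSPLITID2005

* Round 1 age is now xr05, so it cannot collide with r05 from round 2
describe
```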