merging

Post on 14-Apr-2017


INDIA HUMAN DEVELOPMENT SURVEY (IHDS) TRAINING PROGRAM MARCH 16, 2016

How to merge two rounds?

Merging Household Files

Relationship between IHDS-I and IHDS-II households

IHDS-I sample (N=41,554)
Reinterview households (N=34,621)
Attrition (N=6,911)
Split households from round 1 (N=5,397)
Replacement households in IHDS-II (N=2,134)

Most important concept in merging two data files

1. Some households in round 1 have no match in round 2, and vice versa

2. Some households in round 1 match more than one household in round 2
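
Both points can be seen in a small made-up example. The sketch below is only an illustration: the variable names (hhid2005, hhid2012, income2005) and all values are hypothetical, not IHDS data. Household 1 splits into two round 2 households, household 3 drops out, and one round 2 household is new and so has no round 1 ID.

```stata
* Toy round 1 file: one row per household (hypothetical values)
clear
input hhid2005 income2005
1 100
2 200
3 300
end
tempfile r1
save `r1'

* Toy round 2 file: household 1 has split in two, one household is new
clear
input hhid2005 hhid2012
1 11
1 12
2 21
. 41
end
tempfile r2
save `r2'

* 1:m because one round 1 household can match several round 2 households
use `r1', clear
merge 1:m hhid2005 using `r2', gen(_m)
* _m==1: round 1 only, _m==2: round 2 only, _m==3: in both rounds
tabulate _m
list, sepby(hhid2005)
```

Run as a do-file (tempfile macros do not persist across separately executed lines).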

Any questions?
Who were chosen for reinterview?
Recontact rate of 83%? What does it mean?
How were replacement households chosen?
What is a split household?

What is needed to merge household files?
1. Round 1 household file – N=41,554
2. Round 2 household file – N=42,152 (Why are there more cases in round 2?)
3. Linking file – N=42,152 – gives Round 1 identification codes for all Round 2 households that were reinterviewed; linking codes are missing for the 2,134 households that are new

Step 1 – Link round 2 data to linking file to get round 1 ID

    use linkhh, clear
    sort STATEID DISTID PSUID HHID HHSPLITID
    merge 1:1 STATEID DISTID PSUID HHID HHSPLITID using round2HH, gen(_mergeR2link)
    sort STATEID DISTID PSUID HHID2005 HHSPLITID2005
    save round2HH_plus, replace
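
A quick check after Step 1 can confirm that the link behaved as described. This is a sketch that assumes Step 1 has been run and round2HH_plus.dta is in the working directory. The expected counts come from the slides, not from running the code.

```stata
use round2HH_plus, clear

* Both the linking file and the round 2 file are listed at N=42,152,
* so every row would be expected to match (_mergeR2link==3)
tabulate _mergeR2link
count if _mergeR2link != 3

* New (replacement) households have no round 1 ID;
* the slides report 2,134 such households
count if missing(HHID2005)
```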

Step 2 – Merge this Round 2+ file with Round 1 file

    use round1HH, clear
    rename HHID HHID2005
    rename HHSPLITID HHSPLITID2005
    sort STATEID DISTID PSUID HHID2005 HHSPLITID2005
    merge 1:m STATEID DISTID PSUID HHID2005 HHSPLITID2005 using round2HH_plus, gen(_mergeR1R2)
    sort STATEID DISTID PSUID HHID HHSPLITID
    save mergedHHR1R2, replace
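
A 1:m merge requires the key variables to uniquely identify rows in the master (round 1) file. One way to confirm this before merging is sketched below. It assumes the same file and variable names as Step 2.

```stata
use round1HH, clear
rename HHID HHID2005
rename HHSPLITID HHSPLITID2005

* isid exits with an error if these variables do not
* uniquely identify the observations
isid STATEID DISTID PSUID HHID2005 HHSPLITID2005
```

If isid reports an error, the merge in Step 2 would fail for the same reason, so the check mainly gives a clearer message.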

Cases in the merged file are a superset:
Households surveyed in both rounds: N=40,018
Households surveyed in round 1 only (attrition): N=6,911
Households surveyed in round 2 only (replacement): N=2,134
Total: N=49,063
Keep only _mergeR1R2==3 for panel analysis (N=40,018)
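
One possible way to build the panel file and see how many round 2 households each round 1 household became is sketched below. The output file name panelHH and the variable n_r2hh are made up for illustration. It assumes mergedHHR1R2.dta from Step 2 exists.

```stata
use mergedHHR1R2, clear

* Distribution of match status (1 = round 1 only, 2 = round 2 only, 3 = both)
tabulate _mergeR1R2

* Keep households observed in both rounds
keep if _mergeR1R2 == 3
count

* Number of round 2 households linked to each round 1 household;
* values above 1 indicate a split household
bysort STATEID DISTID PSUID HHID2005 HHSPLITID2005: gen n_r2hh = _N
tabulate n_r2hh

save panelHH, replace
```

The tabulation counts round 2 rows, so a round 1 household that split in two contributes two rows with n_r2hh equal to 2.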

Merging Individual Files

Relationship between IHDS-I and IHDS-II individuals

IHDS-I sample (N=215,754)
Reinterviewed individuals (N=150,995)
Household attrition (N=29,299)
Individual attrition in interviewed households (N=35,464)
New individuals in round 1 households (N=43,822)
New individuals in new households (N=9,760)

Most important concept in merging two data files

1. Even reinterview households have new members (births, marriages)

2. Even reinterview households have some members who are no longer there (deaths, marriages, migration)

What is needed to merge individual files?
1. Round 1 individual file – N=215,754
2. Round 2 individual file – N=204,568 (Why are there fewer cases in round 2?)
3. Linking file – N=204,568 – gives Round 1 identification codes for all Round 2 individuals who were reinterviewed; linking codes are missing for individuals who are new in Round 2

Step 1 – Link round 2 data to linking file to get round 1 ID

    use linkind, clear
    sort STATEID DISTID PSUID HHID HHSPLITID PERSONID
    merge 1:1 STATEID DISTID PSUID HHID HHSPLITID PERSONID using round2IND, gen(_mergeR2link)
    sort STATEID DISTID PSUID HHID2005 HHSPLITID2005 PERSONID2005
    save round2IND_plus, replace

Step 2 – Merge this Round 2+ file with Round 1 file

    use round1IND, clear
    rename HHID HHID2005
    rename HHSPLITID HHSPLITID2005
    rename PERSONID PERSONID2005
    sort STATEID DISTID PSUID HHID2005 HHSPLITID2005 PERSONID2005
    merge 1:m STATEID DISTID PSUID HHID2005 HHSPLITID2005 PERSONID2005 using round2IND_plus, gen(_mergeR1R2)
    sort STATEID DISTID PSUID HHID HHSPLITID PERSONID
    save mergedINDR1R2, replace

Cases in the merged file are a superset:
Individuals surveyed in both rounds: N=150,988
Individuals surveyed in round 1 only (attrition/death/migration): N=64,766
Individuals surveyed in round 2 only (replacement/new): N=53,580
Total: N=269,334
Keep only _mergeR1R2==3 for panel analysis (N=150,988)
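
The three groups add up to the file sizes given earlier, which is a useful sanity check on a merge. The arithmetic below uses only the counts from the slides. Each assert stops with an error if the equality does not hold.

```stata
* Matched + round 1 only + round 2 only = rows in the merged file
assert 150988 + 64766 + 53580 == 269334

* Matched + round 1 only = round 1 individual file
assert 150988 + 64766 == 215754

* Matched + round 2 only = round 2 individual file
assert 150988 + 53580 == 204568
```

The same identities can be checked on real output by comparing the tabulation of _mergeR1R2 against the observation counts of the two input files.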

Ever-married woman file linkage

Same process as individual file linkage. One thing to note: there was no ever-married woman file for 2004-5, so you will be merging with the household file from 2004-5.

Merging Caution

Merging does not keep two copies of a variable that has the same name in both files (by default, Stata keeps the value from the master file). So if you want to keep variables from round 1 and round 2 separate, you may want to rename all round 1 variables before merging.

Typically we use the commands

    rename * x*
    rename xSTATEID STATEID

etc. for the merging variables. So xr05 will be age in 2004-5 and r05 will be age in 2011-12.
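
The renaming approach can be tried on a one-row toy dataset before applying it to the real round 1 file. The values below are made up. The variable list is a small subset chosen for illustration, and the final names HHID2005 and HHSPLITID2005 follow the merge steps earlier in this deck.

```stata
* Toy round 1 file with the ID variables and one substantive variable
clear
input STATEID DISTID PSUID HHID HHSPLITID r05
1 1 1 10 1 34
end

* Prefix every variable with x
rename * x*

* Restore the names of the variables used as merge keys
rename (xSTATEID xDISTID xPSUID) (STATEID DISTID PSUID)
rename (xHHID xHHSPLITID) (HHID2005 HHSPLITID2005)

* r05 is now xr05, so it cannot collide with r05 from round 2
describe
```

Renaming the household IDs straight to their 2005 names replaces the separate rename commands in Step 2. Either order gives the same result.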
