Part II – Introduction to SILC Data Structure and Documentation
DwB Training Course on EU-SILC Longitudinal Data, Paris, 19-21 February 2014
Heike Wirth
Aims of this session
• Introduce the rotational design
• Explain the concept of the selected respondent
• Explain the organisation of the data
• Point out some reading: documents of priority
Illustration of the rotational design
Rotational design - Illustration
Initial sample
2006
Rotational design – Illustration cross-sectional
2006
Rotational design – Illustration longitudinal
Rotational design – Illustration longitudinal
e.g. longitudinal data 2011
2006
Rotational design – empirical
Not equivalent to the number of years of participation
Rotational design – empirical
tab DB075 HHYNR
HHYNR (= number of household-years) is not included in the data and must be created.
Source: UDB_l11D_ver 2011-1 from 01-08-2013.dta; own calculations
Rotational design – empirical
tab HHYNR YEAR
HHYNR (= number of household-years) is not included in the data and must be created.
Source: UDB_l11D_ver 2011-1 from 01-08-2013.dta; own calculations
Rotational design – empirical
tab HHYCOUNT HHYNR
HHYCOUNT (= count of household-years) is not included in the data and must be created.
Source: UDB_l11D_ver 2011-1 from 01-08-2013.dta; own calculations
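Since HHYNR and HHYCOUNT have to be created by the user, the following is a minimal sketch of one possible derivation. It assumes that HHYNR is the total number of survey years in which a household appears in the household register and that HHYCOUNT is a running counter of those years; both readings are inferred from the variable labels, not confirmed by the slides. The rows are invented, and the column names `year`, `country` and `hh_id` are placeholders for the corresponding key variables.

```python
from collections import defaultdict

# Invented household-register rows: (year, country, hh_id).
d_rows = [
    (2008, "A", 100), (2009, "A", 100), (2010, "A", 100), (2011, "A", 100),
    (2010, "A", 200), (2011, "A", 200),
    (2011, "B", 100),
]

def add_household_year_counters(rows):
    """Attach HHYNR (years observed in total) and HHYCOUNT (running year counter)."""
    # Collect the set of survey years per household; the household is
    # identified by country and household ID together.
    years_by_household = defaultdict(set)
    for year, country, hh_id in rows:
        years_by_household[(country, hh_id)].add(year)

    result = []
    for year, country, hh_id in rows:
        years = sorted(years_by_household[(country, hh_id)])
        result.append({
            "year": year,
            "country": country,
            "hh_id": hh_id,
            "HHYNR": len(years),                # assumed meaning: total household-years
            "HHYCOUNT": years.index(year) + 1,  # assumed meaning: 1st, 2nd, ... year
        })
    return result

counters = add_household_year_counters(d_rows)
for row in counters:
    print(row)
```

Grouping by country as well as household ID matters because the same household ID can occur in more than one country.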
Observation units: concept of the selected respondent
Selected respondent

| Type of information | Observation unit | Collection unit/data source: survey countries | Collection unit/data source: register countries |
|---|---|---|---|
| Social exclusion, housing, childcare … | Household (HH) | HH-Respondent | Registers/HH-R |
| Basic demographic personal data | All HH-members | HH-Respondent | Registers/HH-R |
| Basic personal data on education, labour information, income … | All HH-members aged 16+ | All HH-members aged 16+ | Registers/HH-R |
| Detailed personal data on health, access to health care, labour market activity … | All HH-members aged 16+ or selected respondent | All HH-members aged 16+ | Selected respondent (one person 16+ per household) |
Example: PH030 – Limitation in activities because of health problems (register countries)
(Mainly) not selected respondents (see PH030_F)
Source: UDB_l11P_ver 2011-1 from 01-08-2013.dta
Organisation of the data
Organisation of the data
EU-SILC consists of 4 separate files for the cross-sectional data:
• Household Register file
• Household Data file
• Personal Register file
• Personal Data file
Organisation of the data
… and of 4 separate data files for the longitudinal data:
• Household Register file
• Household Data file
• Personal Register file
• Personal Data file
Household files – longitudinal

Household Register (D-File)
• Includes every selected household (also those where the address could not be contacted or which could not be interviewed)
> 19 variables: household identifier, sampling design information, region
UDB_l11D_ver 2011-1 from 01-08-2013: N = 542 942 households

Household Data (H-File)
• Only households which have been contacted and have completed a household interview, and in which at least one household member has complete data in the personal data file
> 180 variables (incl. flag variables & imputation factors): basic data, social exclusion, income, housing
UDB_l11H_ver 2011-1 from 01-08-2013: N = 411 189 households
Personal files – longitudinal

Personal Register (R-File)
• Every person currently living in the household or temporarily absent. Longitudinal file: also persons registered in the R-File of the previous year or living at least 3 months in the household during the income reference period.
> 50 variables (incl. flag variables): basic information, e.g. relationship between household members
UDB_l11R_ver 2011-1 from 01-08-2013: N = 1 079 261 persons

Personal Data (P-File)
• Only the reference population (persons aged 16 and over) and only persons for whom the information could be completed by interview (personal/proxy) and/or register
> 190 variables (incl. flag variables & imputation factors): e.g. demographic, income, work and unemployment
UDB_l11P_ver 2011-1 from 01-08-2013: N = 879 720 persons
Depending on the research question: use of separate datasets (Personal Register, Personal Data, Household Register, Household Data) …
… or a combination of different datasets
![Page 23: Part II – Introduction to SILC Data Structure and Documentation](https://reader035.vdocument.in/reader035/viewer/2022062316/56816197550346895dd14636/html5/thumbnails/23.jpg)
24
While for both, c-s and longitudinal data all 4 files are linkable among each other, c-s and longitudinal data are not linkable
Organisation of the data
Personal Register
PersonalData
Household Register
Household Data
Personal Register
PersonalData
Household Register
Household Data
cross-sectional data longitudinal data
![Page 24: Part II – Introduction to SILC Data Structure and Documentation](https://reader035.vdocument.in/reader035/viewer/2022062316/56816197550346895dd14636/html5/thumbnails/24.jpg)
25
… as well as cross-sectional data are not linkable over time (HH-ID and related identifaction variables are randomized)
Organisation of the data
Personal Register
PersonalData
HHRegister
HHData
t
Personal Register
PersonalData
HHRegister
hhData
t+1
![Page 25: Part II – Introduction to SILC Data Structure and Documentation](https://reader035.vdocument.in/reader035/viewer/2022062316/56816197550346895dd14636/html5/thumbnails/25.jpg)
26
• In order to link (combine) the four files D, H, R and P among each others all observations must have a unique link to the respective three other files
This link is achieved by the following 4 key variables (1) Year of Survey (2) Country (3) Household ID (4) Personal ID
Organisation of the data… combine different datasets – Key Variables
![Page 26: Part II – Introduction to SILC Data Structure and Documentation](https://reader035.vdocument.in/reader035/viewer/2022062316/56816197550346895dd14636/html5/thumbnails/26.jpg)
27
Organisation of the data… combine different datasets – Key Variables
Personal Register
Personal Data
Household Register
Household Data
Year of SurveyCountry Household IDPersonal ID
Year of SurveyCountry Household ID
Year of SurveyCountry Household ID
![Page 27: Part II – Introduction to SILC Data Structure and Documentation](https://reader035.vdocument.in/reader035/viewer/2022062316/56816197550346895dd14636/html5/thumbnails/27.jpg)
28
• Household ID • Cross-sectional (max. 6 digits) = hh number 1-999999 • Longitudinal (max. 8 digits) = hh number 1-999999 + split number
Default split number = 00
• Personal ID• Cross-sectional = hh-id + personal number (max 2 digits)• Longitudinal = hh number + default split number (00) + personal number
In the longitudinal survey the Personal ID never changes, even if the person moves to a different household
in the cross-sectional survey, from year to year the Household ID and Personal ID may change
Organisation of the data Household ID – Personal ID
![Page 28: Part II – Introduction to SILC Data Structure and Documentation](https://reader035.vdocument.in/reader035/viewer/2022062316/56816197550346895dd14636/html5/thumbnails/28.jpg)
29
The 4 key variables – illustration (longitudinal data)year country hh_id pers_id year of birth2010 A 40017100 4001710001 19372010 A 40017100 4001710002 19392011 A 40017100 4001710001 19372011 A 40017100 4001710002 19392009 B 40017100 4001710001 19532009 B 40017100 4001710002 19562009 B 40017100 4001710003 19822009 B 40017100 4001710004 19842009 B 40017100 4001710005 19852010 B 40017100 4001710001 19532010 B 40017100 4001710002 19562010 B 40017100 4001710003 19822010 B 40017100 4001710004 19842010 B 40017100 4001710005 19852010 B 40017101 4001710003 19822010 B 40017101 4001710004 19842011 B 40017100 4001710001 19532011 B 40017100 4001710002 19562011 B 40017100 4001710005 19852011 B 40017101 4001710002 19562011 B 40017101 4001710003 19822011 B 40017101 4001710004 1984
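The ID rules can be checked against the illustration. The sketch below assumes that a longitudinal household ID is the household number followed by a two-digit split number, and that a longitudinal personal ID is the household number, the default split number 00 and a two-digit personal number. The two-digit width of the personal number is an assumption carried over from the cross-sectional rule, and the helper names are hypothetical.

```python
def split_household_id(hh_id):
    """Split a longitudinal household ID into (household number, split number)."""
    return hh_id // 100, hh_id % 100

def split_personal_id(pers_id):
    """Split a longitudinal personal ID into
    (original household number, split number, personal number)."""
    return pers_id // 10000, (pers_id // 100) % 100, pers_id % 100

# Rows for one person in country B, taken from the illustration: (year, hh_id, pers_id).
rows = [
    (2009, 40017100, 4001710003),
    (2010, 40017100, 4001710003),
    (2010, 40017101, 4001710003),
    (2011, 40017101, 4001710003),
]

for year, hh_id, pers_id in rows:
    hh_number, split_number = split_household_id(hh_id)
    origin_hh, _, person_number = split_personal_id(pers_id)
    print(year, hh_number, split_number, origin_hh, person_number)
```

In these rows the personal ID stays the same while the split number of the household ID changes from 00 to 01, which is how a move into a split-off household shows up in the longitudinal data.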
Combining information from two separate files at a 1:1 level
Combined data
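A 1:1 combination can be sketched as follows, here for a household register and a household data file in which each household-year occurs at most once. The rows are invented, the column names `year`, `country` and `hh_id` are placeholders for the key variables, and `merge_one_to_one` is a hypothetical helper rather than part of any EU-SILC tooling.

```python
KEYS = ("year", "country", "hh_id")

# Invented rows: the register holds every selected household,
# the data file only households with a completed interview.
d_file = [
    {"year": 2010, "country": "a", "hh_id": 6800, "DB075": 1},
    {"year": 2011, "country": "a", "hh_id": 6800, "DB075": 1},
    {"year": 2011, "country": "a", "hh_id": 6900, "DB075": 2},  # not interviewed
]
h_file = [
    {"year": 2010, "country": "a", "hh_id": 6800, "HX080": 0},
    {"year": 2011, "country": "a", "hh_id": 6800, "HX080": 0},
]

def merge_one_to_one(left, right, keys):
    """Left-join two lists of dicts on `keys`; a key may occur only once per file."""
    index = {}
    for row in right:
        key = tuple(row[k] for k in keys)
        if key in index:
            raise ValueError(f"duplicate key in right file: {key}")
        index[key] = row

    merged, seen = [], set()
    for row in left:
        key = tuple(row[k] for k in keys)
        if key in seen:
            raise ValueError(f"duplicate key in left file: {key}")
        seen.add(key)
        match = index.get(key)
        combined = {**row, **(match or {})}
        combined["matched"] = match is not None  # marks rows without a partner
        merged.append(combined)
    return merged

dh = merge_one_to_one(d_file, h_file, KEYS)
for row in dh:
    print(row)
```

Keeping a `matched` indicator makes it easy to see which register households have no household data, which is expected given the different case numbers of the two files.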
Combining information from two separate files at a 1:n level
Combined data
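A 1:n combination attaches one household-level row to every person of that household. The sketch below uses invented rows with the placeholder key names `year`, `country`, `hh_id` and `pers_id`; `merge_one_to_many` is a hypothetical helper.

```python
KEYS = ("year", "country", "hh_id")

# Invented rows: one household-data row, several person rows per household-year.
h_file = [
    {"year": 2011, "country": "c", "hh_id": 7000, "HX080": 1},
]
r_file = [
    {"year": 2011, "country": "c", "hh_id": 7000, "pers_id": 700001},
    {"year": 2011, "country": "c", "hh_id": 7000, "pers_id": 700002},
    {"year": 2011, "country": "c", "hh_id": 7000, "pers_id": 700003},
    {"year": 2011, "country": "c", "hh_id": 7100, "pers_id": 710001},  # no household row
]

def merge_one_to_many(one, many, keys):
    """Copy the household-level row onto every person row with the same key."""
    lookup = {}
    for row in one:
        key = tuple(row[k] for k in keys)
        if key in lookup:
            raise ValueError(f"key is not unique on the household side: {key}")
        lookup[key] = row

    merged = []
    for row in many:
        household = lookup.get(tuple(row[k] for k in keys), {})
        # Person values win if a column name exists on both sides.
        merged.append({**household, **row})
    return merged

rh = merge_one_to_many(h_file, r_file, KEYS)
for row in rh:
    print(row)
```

The household key excludes the personal ID, so the same household value is repeated for each household member; persons without a household row simply keep their own columns.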
Use of separate sub-datasets
Create household-level variables from personal-level data, e.g.
• number of current household members
• persons < 18 in household
• age of the youngest child in household
• number of unemployed hh-members
• highest educational level in household
• …
Create new household-level summary variables from person-level information, e.g. household size, number of children, age of youngest child (< 18 years)
New hh-level variables: hhsize, numchild, ychild. Added from hh-data: HX080.

| year | country | hh_id | pers_id | RX010 | hhsize | numchild | ychild | HX080 |
|---|---|---|---|---|---|---|---|---|
| 2010 | a | 6800 | 680001 | 36 | 3 | 1 | 17 | 0 |
| 2010 | a | 6800 | 680002 | 35 | 3 | 1 | 17 | 0 |
| 2010 | a | 6800 | 680003 | 17 | 3 | 1 | 17 | 0 |
| 2011 | a | 6800 | 680001 | 36 | 3 | 0 | . | 0 |
| 2011 | a | 6800 | 680002 | 36 | 3 | 0 | . | 0 |
| 2011 | a | 6800 | 680003 | 18 | 3 | 0 | . | 0 |
| 2011 | b | 6800 | 680001 | 69 | 2 | 0 | . | 0 |
| 2011 | b | 6800 | 680002 | 73 | 2 | 0 | . | 0 |
| 2010 | b | 7000 | 700001 | 80 | 2 | 0 | . | 0 |
| 2010 | b | 7000 | 700002 | 80 | 2 | 0 | . | 0 |
| 2008 | c | 7000 | 700001 | 42 | 3 | 1 | 2 | 1 |
| 2008 | c | 7000 | 700002 | 34 | 3 | 1 | 2 | 0 |
| 2008 | c | 7000 | 700003 | 2 | 3 | 1 | 2 | 0 |
| 2009 | c | 7000 | 700001 | 43 | 3 | 1 | 3 | 0 |
| 2009 | c | 7000 | 700002 | 35 | 3 | 1 | 3 | 0 |
| 2009 | c | 7000 | 700003 | 3 | 3 | 1 | 3 | 0 |
| 2010 | c | 7000 | 700001 | 44 | 3 | 1 | 4 | 1 |
| 2010 | c | 7000 | 700002 | 36 | 3 | 1 | 4 | 1 |
| 2010 | c | 7000 | 700003 | 4 | 3 | 1 | 4 | 1 |
| 2011 | c | 7000 | 700001 | 45 | 4 | 2 | 0 | 1 |
| 2011 | c | 7000 | 700002 | 37 | 4 | 2 | 0 | 1 |
| 2011 | c | 7000 | 700003 | 5 | 4 | 2 | 0 | 1 |
| 2011 | c | 7000 | 700004 | 0 | 4 | 2 | 0 | 1 |
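The summary variables in the illustration can be reproduced with a simple group-and-aggregate step. The sketch below is one way to do it, assuming that RX010 holds the person's age and that every person row counts as a household member; a real analysis would probably first filter on membership status, which is not shown here. The function name is hypothetical.

```python
from collections import defaultdict

# Person-level rows taken from the illustration: (year, country, hh_id, pers_id, RX010).
persons = [
    (2010, "a", 6800, 680001, 36),
    (2010, "a", 6800, 680002, 35),
    (2010, "a", 6800, 680003, 17),
    (2011, "a", 6800, 680001, 36),
    (2011, "a", 6800, 680002, 36),
    (2011, "a", 6800, 680003, 18),
    (2011, "c", 7000, 700001, 45),
    (2011, "c", 7000, 700002, 37),
    (2011, "c", 7000, 700003, 5),
    (2011, "c", 7000, 700004, 0),
]

def household_summaries(rows, child_age_limit=18):
    """Return hhsize, numchild and ychild per (year, country, hh_id)."""
    ages_by_household = defaultdict(list)
    for year, country, hh_id, _pers_id, age in rows:
        ages_by_household[(year, country, hh_id)].append(age)

    summaries = {}
    for key, ages in ages_by_household.items():
        child_ages = [age for age in ages if age < child_age_limit]
        summaries[key] = {
            "hhsize": len(ages),                               # all person rows
            "numchild": len(child_ages),                       # persons under 18
            "ychild": min(child_ages) if child_ages else None, # None = no child
        }
    return summaries

summaries = household_summaries(persons)
for key, values in sorted(summaries.items()):
    print(key, values)
```

The resulting dictionary has one entry per household-year and can then be attached to each person row with a 1:n combination on year, country and household ID.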
Some reading – Documents of priority
Some reading – Documents of priority
Guidelines_Doc65_2011.pdf
• General technical information on sample design, weights, etc.
• List of all variables included in the original EU-SILC database
• Description of (cross-sectional and longitudinal) variables
DIFFERENCES BETWEEN DATA COLLECTED AND UDB.doc
• List of variables removed from or added to the User Database (UDB)
• Methods of anonymisation
SILC L-2011 UDB PROBLEMS AND MODIFICATIONS.xls
National and EU quality reports
• http://epp.eurostat.ec.europa.eu/portal/page/portal/income_social_inclusion_living_conditions/quality
Some reading – Documents of priority: Guidelines_Doc65_2011.pdf
Source: Guidelines_Doc65_2011.pdf
Some reading – Documents of priority: flag variable HH020_F
Source: Guidelines_Doc65_2011.pdf
Some reading – Documents of priority: flag variable HH021_F
Source: Guidelines_Doc65_2011.pdf
Some reading – Documents of priority: cross-sectional data 2011
Source: UDB_c11H_ver 2011-2 from 01-08-13.dta
Some reading – Documents of priority: longitudinal data 2011
[Figure labels: New (HH021), Old (HH020)]
Source: UDB_l11H_ver 2011-1 from 01-08-2013.dta
Some reading – Documents of priority. Example: variable included in the cross-sectional and longitudinal data
Source: Guidelines_Doc65_2011.pdf
Some reading – Documents of priority. Example: variable included in the cross-sectional data only
Source: Guidelines_Doc65_2011.pdf
Some reading – Documents of priority. Example: variable included in the longitudinal data only
Source: Guidelines_Doc65_2011.pdf
Some reading – Documents of priority. Example: selected respondent
Source: Guidelines_Doc65_2011.pdf
Some reading – Documents of priority: differences between data collected and User Database (cross-sectional file)
Some reading – Documents of priority: differences between data collected and User Database (longitudinal file)
Source: L2011 DIFFERENCES BETWEEN DATA COLLECTED AND UDB.doc
Some reading – Documents of priority: differences between data collected and User Database (cross-sectional file)
Some reading – Documents of priority: differences between data collected and User Database (longitudinal file)
Some reading – Documents of priority: SILC L-2011 UDB PROBLEMS AND MODIFICATIONS.xls
Source: SILC L-2011 UDB PROBLEMS AND MODIFICATIONS.xls
Some reading – Documents of priority: quality reports
Data Structure – Some reading: national quality reports
Data Structure – Some reading, e.g. Austria: Final Quality Report Relating to the EU-SILC Operation 2007-2010
Source: Austria, Final Quality Report Relating to the EU-SILC Operation 2007-2010, p. 7