Imputation and Editing of Income from the Administrative File in Israel’s Censuses prepared by Orly Furman and Dmitri Romanov UNITED NATIONS ECONOMIC COMMISSION FOR EUROPE CONFERENCE OF EUROPEAN STATISTICIANS Work Session on Statistical Data Editing Ljubljana, May 2011


Use of Administrative Income Files in Israel’s Censuses

Questionnaire
  1995 Census: Employment; salary and self-employment income (net and gross)
  2008 Census: Employment

Reference period
  1995 Census: September 1995
  2008 Census: December 2008; calendar year 2008

Administrative income files
  1995 Census: Consistency checks of reportage on salary; imputation of 8.6% records when data missing
  2008 Census: Imputation of salary and self-employment income for all records; consistency checks of reportage on employment

Administrative Income File used in the 1995 Census

Source: National Insurance Institute (NII). Coverage: 1.965 million employee posts, months of work and annual salary and wage of 1.815 million employees, as reported to the NII.

                     Adm. file   Adm. file for the    Reported by the     Diff. census vs. adm. file
                                 20% census sample    20% census sample   for the census sample, %
Average wage, NIS    4,706       4,763                3,978               -16.5
Employees, 000       1,815.1     1,705.0              1,566.5             -8.1
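The last column of the table can be reproduced from the two sample columns. The sketch below assumes the difference is measured relative to the administrative file for the census sample; the slides do not state the formula, but this reading matches the published figures. The function name `pct_diff` is illustrative.

```python
def pct_diff(census: float, admin: float) -> float:
    """Difference of the census figure from the administrative one, in percent of the latter."""
    return (census - admin) * 100.0 / admin


# Figures taken from the table: census sample vs. adm. file for the same sample.
wage_diff = pct_diff(3978, 4763)          # average wage, NIS
employee_diff = pct_diff(1566.5, 1705.0)  # employees, thousands

print(round(wage_diff, 1), round(employee_diff, 1))
```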

Amendments to the Salary Data in the 1995 Census

Treatment/amendment                                    Percentage of total
Non-amended value                                      69.0
Gross salary imputed by regression from net salary     21.0
Imputation of data from adm. file                      8.6
Editing of irregularities (division by 100/10)         1.2
Other editing                                          0.2
Total                                                  100.0
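The "division by 100/10" edit suggests that some reported salaries were off by a factor of ten or a hundred. The slides do not describe how such records were detected, so the following is only a sketch under an assumption: a reported value is rescaled when dividing it by 100 or 10 brings it close to a reference value such as the monthly salary in the administrative file. The function name and the `tolerance` parameter are hypothetical.

```python
def fix_scale_error(reported: float, reference: float, tolerance: float = 0.25) -> float:
    """Return the reported salary, divided by 100 or 10 if that lands near the reference.

    `tolerance` is the allowed relative distance from the reference value
    (an assumed threshold, not taken from the slides).
    """
    for factor in (100, 10):
        candidate = reported / factor
        if abs(candidate - reference) <= tolerance * reference:
            return candidate
    return reported


print(fix_scale_error(45000, 4600))   # one extra digit in the reported value
print(fix_scale_error(470000, 4600))  # two extra digits in the reported value
print(fix_scale_error(5000, 4600))    # plausible value, left unchanged
```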

Reporting Salary in Census: Rounding

[Figure: Distribution of September Salary Reported in the Census. Horizontal axis: reported salary, NIS (0 to 12,000). Vertical axis: prevalence, percentage (0 to 8).]
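Rounding in self-reported salaries can be quantified as the share of reports that fall exactly on round numbers. The sketch below uses made-up values, not census data, and the function name `share_on_multiples` is illustrative.

```python
def share_on_multiples(values, step: int) -> float:
    """Share of reported values that are exact multiples of `step`."""
    values = list(values)
    if not values:
        return 0.0
    return sum(1 for v in values if v % step == 0) / len(values)


# Illustrative reported salaries in NIS (invented for this example).
sample = [3000, 3500, 4000, 4120, 5000, 2875, 6000, 4500]

share_1000 = share_on_multiples(sample, 1000)
share_500 = share_on_multiples(sample, 500)
print(share_1000, share_500)
```

A share well above what a smooth salary distribution would produce points to respondents rounding their answers.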

Reporting Salary in Census: Confounding Net and Gross

[Figure: Deviation of the September Salary Reported in the Census from the Gross and Net Monthly Salary Per Job in the Administrative File, by Annual Salary Percentiles, as a percentage of gross calculated salary. Horizontal axis: annual salary percentiles (0 to 100). Vertical axis: deviation, % (-40 to 40). Series: deviation from “gross” salary; deviation from “net” salary.]
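The two series in the figure can be computed per person as follows. This is a sketch based on the caption alone: both deviations are expressed as a percentage of the gross monthly salary per job calculated from the administrative file. The function name and the example values are hypothetical.

```python
def deviations(reported: float, gross: float, net: float) -> tuple[float, float]:
    """Deviation of the reported salary from gross and from net, both in % of gross."""
    dev_from_gross = (reported - gross) * 100.0 / gross
    dev_from_net = (reported - net) * 100.0 / gross
    return dev_from_gross, dev_from_net


# A respondent who reports the net amount when asked for gross (invented values).
dev_gross, dev_net = deviations(reported=4000, gross=5000, net=4000)
print(dev_gross, dev_net)
```

A deviation from net close to zero, combined with a clearly negative deviation from gross, is the pattern expected when respondents report net salary in place of gross.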

Imputing Salary in Census: Challenge of Multiple Jobs

[Figure: Distribution of Values Imputed on the Basis of the Administrative File for Employees who Held One Job (Left) and More Than One Job (Right) in 1995]

Administrative Income Files Used in the 2008 Census

Source: Tax Authority.

Coverage: employee posts, months of work and annual salary and wage of the employees, and business income of the self-employed individuals.

Usage: Imputation of earnings (salary and business income of the self-employed), conditional on workforce status as reported in the Census and occurrence in the administrative income files.

Challenge: Treatment of inconsistencies between the two sources that arise from misreporting in the Census and/or omissions in the administrative file.

Identification of Cases in Which the Census Data and the Administrative Data do Not Coincide

Discrepancies between the Census Data and the Administrative Data

• Group A: Individuals who had work income in 2008 according to the administrative income data base but who, according to the census, did not belong to the annual workforce.

• Group B: Individuals who reported in the census that they belonged to the annual workforce but for whom no work income was found in the administrative data bases.
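The two groups amount to a cross-classification of two yes/no indicators. A minimal sketch, with hypothetical argument names:

```python
from typing import Optional


def discrepancy_group(in_census_workforce: bool, has_admin_income: bool) -> Optional[str]:
    """Classify a person by the agreement between census and administrative data.

    Returns "A" or "B" for the two discrepancy groups, or None when the sources agree.
    """
    if has_admin_income and not in_census_workforce:
        return "A"  # income in the administrative file, not in the census workforce
    if in_census_workforce and not has_admin_income:
        return "B"  # in the census workforce, no income in the administrative file
    return None


print(discrepancy_group(in_census_workforce=False, has_admin_income=True))
print(discrepancy_group(in_census_workforce=True, has_admin_income=False))
```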

Analysis of Cases in Group A

• 67% of the individuals in this group are in the primary working-age group (19 to 65). According to the income tax data, 51% worked less than half a year in 2008. This supports the hypothesis that under-reporting of employment in the Census is connected to irregular employment over the year.

• For 43% of this group, a record exists in the administrative income data base for December 2008.

• 74% of the individuals who had work income in December but did not report employment in the census were in the primary working-age group. For two thirds of them, the income data base shows ongoing employment in 2008 lasting more than six months. This indicates a high probability that labour market non-participation was reported inaccurately in the census.

Analysis of Cases in Group B

Work status                      Distribution as reported   Absent from the income
                                 in the census              data base, % of cell
Total                            100.0                      12.3
Employees                        86.3                       10.7
Self employed – not employing    8.3                        16.5
Self employed – employing        4.4                        12.7
Cooperative members              0.1                        25.6
Kibbutz members                  0.8                        75.6
Unpaid family members            0.1                        51.2

Analysis of Cases in Group B (cont.)

• The working hypothesis is that information on employees and the self-employed is missing because employers and self-employed individuals reported late, or failed to report, to the income tax authority.

• Accordingly, for an employee who is absent from the 2008 income data base, the preceding year should be examined to check whether the employee was active and who the employer was.

• The examination shows that more than 50% worked in 2007 in employee jobs. 80% of these worked for employers that reported in 2007 but did not report in 2008.
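The check described in these bullets can be sketched as a lookup against the sets of employers that filed in each year. All identifiers below are hypothetical and the data are invented for illustration.

```python
def employer_lapsed(employer_id: str, reporters_2007: set, reporters_2008: set) -> bool:
    """True if the employer filed in 2007 but is missing from the 2008 file."""
    return employer_id in reporters_2007 and employer_id not in reporters_2008


# Hypothetical employer identifiers, for illustration only.
reporters_2007 = {"E1", "E2", "E3"}
reporters_2008 = {"E1"}

# Person -> employer in 2007, for census workers absent from the 2008 income file.
jobs_2007 = {"p1": "E2", "p2": "E1", "p3": "E9"}

lapsed = {
    person: employer_lapsed(employer, reporters_2007, reporters_2008)
    for person, employer in jobs_2007.items()
}
print(lapsed)
```

Only `p1` qualifies here: the employer of `p2` did report in 2008, and the employer of `p3` never appears in the 2007 file.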

Algorithm of Income Imputation

Group: Found to have work income according to the income data base but do not belong to the workforce according to the census.
Income recording method: Work months and salary imputed as per the income data base.
% of total imputed cases: 61.7
% of cases reported in census: 7.9

Algorithm of Income Imputation (cont.)

Group: Belong to the workforce according to the census but found not to have work income according to the income data base; found to have been employed in 2007 by employers that did not report in 2008.
Income recording method: The individual’s salary for 2007 was imputed, adjusted for the average salary increase in the economic industry.
% of total imputed cases: 15.2
% of cases reported in census: 1.9

Algorithm of Income Imputation (cont.)

Group: Belong to the workforce according to the census but found not to have work income according to the income data base; found to be self-employed individuals who reported in 2007 but did not report in 2008.
Income recording method: Income was imputed for holders of active files in 2007, adjusted for the average income increase in the economic industry.
% of total imputed cases: 2.6
% of cases reported in census: 0.3

Algorithm of Income Imputation (cont.)

Group: Belong to the workforce according to the census but found not to have work income according to the income data base; military personnel, housekeepers, caretakers and persons with unknown occupation.
Income recording method: Income was imputed from the ongoing survey, according to the average income as per defined estimation cells*.
% of total imputed cases: 3.6
% of cases reported in census: 0.5

Algorithm of Income Imputation (cont.)

Group: Individuals who reported having worked in the census but do not belong to the abovementioned groups.
Income recording method: Income was imputed based on average income in the defined estimation cells**, according to the number of months worked as reported in the census.
% of total imputed cases: 16.9
% of cases reported in census: 2.1
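The five groups above can be read as a decision cascade. The sketch below is one possible encoding, not the procedure actually run in the census: the field names, the method labels and the order in which the Group B sub-cases are tested (here, slide order) are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Person:
    # All field names are hypothetical.
    in_census_workforce: bool
    has_admin_income_2008: bool
    employer_2007_silent_in_2008: bool = False  # employee of an employer that stopped reporting
    self_employed_file_2007_only: bool = False  # active self-employed file in 2007, none in 2008
    special_occupation: bool = False            # military, housekeeper, caretaker, unknown


def imputation_method(p: Person) -> str:
    """Pick an income source following the order of the groups on the slides."""
    if p.has_admin_income_2008:
        # Covers both agreeing records and those outside the census workforce.
        return "admin_2008"
    if not p.in_census_workforce:
        return "no_income"
    if p.employer_2007_silent_in_2008:
        return "salary_2007_adjusted"
    if p.self_employed_file_2007_only:
        return "business_income_2007_adjusted"
    if p.special_occupation:
        return "survey_cell_mean"
    return "cell_mean_by_months_worked"


def adjust_2007_income(income_2007: float, industry_growth: float) -> float:
    """Carry a 2007 income forward by the average increase in the person's industry."""
    return income_2007 * (1.0 + industry_growth)


worker = Person(in_census_workforce=True, has_admin_income_2008=False,
                employer_2007_silent_in_2008=True)
print(imputation_method(worker), adjust_2007_income(100_000, 0.05))
```

The growth rate of 0.05 is an invented value; the slides only say that the 2007 figure was adjusted for the average increase in the economic industry.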

The Bottom Line

• All in all, discrepancies between the census report and the administrative source had to be treated, and the income information from the administrative file amended, in only 12.7% of the cases that reported employment in the 2008 census. In the remaining 87.3%, the earnings of employees and the self-employed were imputed directly from the administrative file.

• In contrast, income data as reported in the “traditional” 1995 census had to be amended or imputed in 29.6% of cases.
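These totals can be cross-checked against the tables earlier in the presentation. In the sketch below, the reading of 29.6% as the sum of the regression-imputed share (21.0) and the administrative-file imputations (8.6) from the 1995 amendments table is an inference from the arithmetic, not something the slides state.

```python
# Shares per imputation group, from the "Algorithm of Income Imputation" slides.
share_of_imputed_cases = [61.7, 15.2, 2.6, 3.6, 16.9]
share_of_census_cases = [7.9, 1.9, 0.3, 0.5, 2.1]

total_imputed = round(sum(share_of_imputed_cases), 1)
total_treated_2008 = round(sum(share_of_census_cases), 1)
untreated_2008 = round(100.0 - total_treated_2008, 1)

# 1995: gross salary imputed by regression + imputation from the adm. file (assumed reading).
amended_1995 = round(21.0 + 8.6, 1)

print(total_imputed, total_treated_2008, untreated_2008, amended_1995)
```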

Thank you!