quality in italian consumer price survey: optimal allocation of resources and indicators to monitor...

Post on 17-Dec-2015


Quality in Italian consumer price survey: optimal allocation of resources and indicators to monitor the data collection process

Federico Polidoro, Rosabel Ricci, Anna Maria Sgamba

( Istat - Italy )

introduction

quality in Consumer Price Survey

two research topics:

1. the optimal allocation of the available resources (minimizing sampling error + burden and cost)

2. the definition of a system of indicators to monitor the data collection process (minimizing non-sampling error)

the optimal allocation of the available resources

the issue: the calculation of a consumer price index (CPI) requires a large amount of resources

the aim: allocating these resources in the most efficient way (quality: burden and cost)

indicators to monitor data collection process

the issue: the definition of a system of indicators to monitor the data collection process

the aim: improving data quality (quality: accuracy)

1. the optimal allocation of the available resources

Approach description

Italian background

Approach to variance estimation

Cost function

Case study and results

the objective of this research

identifying the optimal sample sizes, in terms of either outlets or elementary items observed, in order to minimize the sampling error as measured by the sampling variance

the optimal allocation approach

2 pillars: a variance function and a cost function, in order to derive the optimal sample sizes that minimize the variance of the estimates for a given cost

Italian background

consumer price index sampling structure

• Sampling of geographical areas

• Sampling of outlets

• Sampling of products

• Sampling of elementary items in each outlet

consumer price index sampling design: non-probability sampling

consumer price index sampling methods

Sampling of geographical areas

the selection of geographical areas is established by Italian laws (No 222/1927 and 621/1975)

in 2007 prices were collected in 85 county chief towns (Municipal Offices of Statistics, MOS) all over the national territory

Sampling of outlets

within each county chief town, the selection of outlets is carried out by the MOS

the sample is drawn from the outlet list of the Chamber of Commerce, the statistical business register (ASIA), census data and other local sources

the outlets with the highest total sales are chosen (mix of cut-off and quota sampling)

in 2007 prices were collected in about 40,000 outlets all over the national territory
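As a rough illustration of how a quota and a cut-off rule can be combined, the Python sketch below keeps, within each outlet type, the outlets with the highest sales up to a quota. The record layout, the outlet types and the quotas are hypothetical; this is one possible reading of the selection rule, not the procedure actually followed by the MOS.

```python
def select_outlets(outlets, quota_per_type):
    """Pick the top-selling outlets within each outlet type.

    outlets: list of (name, outlet_type, total_sales) tuples (hypothetical layout).
    quota_per_type: dict mapping outlet_type -> number of outlets to keep.
    """
    chosen = []
    for outlet_type, quota in quota_per_type.items():
        candidates = [o for o in outlets if o[1] == outlet_type]
        # Cut-off step: rank by total sales, largest first, and keep the top ones.
        candidates.sort(key=lambda o: o[2], reverse=True)
        chosen.extend(candidates[:quota])
    return chosen


# Hypothetical outlet list for one chief town.
outlets = [
    ("A", "supermarket", 900.0),
    ("B", "supermarket", 400.0),
    ("C", "supermarket", 650.0),
    ("D", "traditional shop", 120.0),
    ("E", "traditional shop", 80.0),
]
selected = select_outlets(outlets, {"supermarket": 2, "traditional shop": 1})
selected_names = [name for name, _, _ in selected]
```

Because the outlets are chosen deterministically by sales, no inclusion probabilities arise from this rule, which is why the variance estimation later has to postulate them.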

Sampling of products

the selection of products is carried out by the National Institute of Statistics (Istat)

the selection of the products - a list (basket) of product types with product type specifications - is based on sales data (cut-off sampling)

in 2007, 540 products were included in the CPI basket

Sampling of elementary items in each outlet

within each outlet, the selection of elementary items is carried out by the MOS's price collector

the most sold elementary item is chosen (the representative item method)

in 2007 about 400,000 price quotations were collected all over the national territory

sample update: yearly base revision

optimum sample allocation: current sizes of samples for elementary items are not optimal

the approach to variance estimation

The Swedish approach has been used to estimate the variance of the CPI (Dalén, Ohlsson, 1995)

the sample is considered as drawn from a two-dimensional population of products and outlets: a cross-classified sample (CCS)

representative products: rows ($i$)

outlets: columns ($j$)

stratification into categories of products: stratum ($g$)

stratification into outlet groups: stratum ($h$)

the crossing of strata: cell ($g,h$)

the parameter (index): $I$

the parameter estimator (index): $\hat{I}$

the general index (target parameter)

$$I = \sum_g \sum_h V_{gh} I_{gh}$$

where $I_{gh}$ is the cell index and $V_{gh}$ is the turnover weight of cell $(g,h)$, i.e. of the category of products $g$ traded in the outlets of group $h$

the cell index

$$I_{gh} = \frac{Y_{gh}}{X_{gh}} = \frac{\sum_i \sum_j l_{ij} w_i w_j f_{ij}^1}{\sum_i \sum_j l_{ij} w_i w_j f_{ij}^0}$$

where

$w_i$ = weight for representative product $i$

$w_j$ = weight for outlet $j$

$l_{ij}$ = 1 if representative product $i$ is traded in outlet $j$, 0 otherwise

$$f_{ij}^1 = \frac{p_{ij}^1}{(p_{ij}^0 + p_{ij}^1)/2}, \qquad f_{ij}^0 = \frac{p_{ij}^0}{(p_{ij}^0 + p_{ij}^1)/2}$$
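The cell index can be illustrated with a short Python sketch. It assumes that $p_{ij}^0$ and $p_{ij}^1$ are the base-period and current-period prices of product $i$ in outlet $j$ (the slides do not define them), and it marks a product not traded in an outlet ($l_{ij} = 0$) with `None`. All prices and weights below are hypothetical.

```python
def cell_index(p0, p1, w_prod, w_out):
    """Cell index I_gh = Y_gh / X_gh for a single cell (g, h).

    p0[i][j], p1[i][j]: base and current price of product i in outlet j,
    or None where the product is not traded there (l_ij = 0).
    w_prod[i], w_out[j]: product and outlet weights.
    """
    y = x = 0.0
    for i, w_i in enumerate(w_prod):
        for j, w_j in enumerate(w_out):
            if p0[i][j] is None or p1[i][j] is None:
                continue  # l_ij = 0: the pair contributes nothing
            mean_price = (p0[i][j] + p1[i][j]) / 2
            y += w_i * w_j * p1[i][j] / mean_price  # term with f_ij^1
            x += w_i * w_j * p0[i][j] / mean_price  # term with f_ij^0
    return y / x


# Hypothetical cell: 2 products x 2 outlets, one pair not traded.
p0 = [[1.00, 2.00], [4.00, None]]
p1 = [[1.10, 2.00], [4.40, None]]
index = cell_index(p0, p1, w_prod=[0.6, 0.4], w_out=[0.5, 0.5])

# If every price rises by 10%, the index equals 1.1 whatever the weights.
uniform = cell_index([[1.0, 2.0]], [[1.1, 2.2]], w_prod=[1.0], w_out=[0.5, 0.5])
```

Dividing both prices by their mean keeps each pair's contribution bounded, since $f_{ij}^0 + f_{ij}^1 = 2$ for every traded pair.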

the estimated general index

$$\hat{I} = \sum_g \sum_h V_{gh} \hat{I}_{gh}$$

where the estimated cell index is

$$\hat{I}_{gh} = \frac{\hat{Y}_{gh}}{\hat{X}_{gh}} = \frac{\sum_i \sum_j l_{ij} f_{ij}^1}{\sum_i \sum_j l_{ij} f_{ij}^0}$$

under the CCS assumption the variance estimator can be decomposed as

$$V(\hat{I})_{tot} \approx V_{PRO} + V_{OUT} + V_{INT}$$

where

$V_{PRO}$ = variance between representative products

$V_{OUT}$ = variance between outlets

$V_{INT}$ = outlet and representative product interaction variance

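formulas for variance estimation

$$\hat{V}_{OUT} = \sum_h \frac{1}{n_h (n_h - 1)} \sum_j (1 - \pi_{hj}) \left( \sum_g \frac{V_{gh}}{\hat{X}_{gh}} \hat{e}^{gh}_{\cdot j} \right)^2$$

$$\hat{V}_{PRO} = \sum_g \frac{1}{m_g (m_g - 1)} \sum_i (1 - \pi_{gi}) \left( \sum_h \frac{V_{gh}}{\hat{X}_{gh}} \hat{e}^{gh}_{i \cdot} \right)^2$$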

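with the following formula for the interaction variance

$$\hat{V}_{INT} = \sum_g \sum_h \frac{1}{m_g (m_g - 1)} \frac{1}{n_h (n_h - 1)} \frac{V_{gh}^2}{\hat{X}_{gh}^2} \sum_i \sum_j (1 - \pi_{gi})(1 - \pi_{hj}) \left( \hat{e}^{gh}_{ij} - \hat{e}^{gh}_{i \cdot} - \hat{e}^{gh}_{\cdot j} \right)^2$$

where

$$\hat{e}^{gh}_{ij} = l_{ij} \left( f_{ij}^1 - \hat{I}_{gh} f_{ij}^0 \right), \qquad \hat{e}^{gh}_{i \cdot} = \frac{1}{n_h} \sum_j \hat{e}^{gh}_{ij}, \qquad \hat{e}^{gh}_{\cdot j} = \frac{1}{m_g} \sum_i \hat{e}^{gh}_{ij}$$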
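The residuals entering the variance formulas can be sketched in Python for a single cell $(g,h)$. The sketch assumes that the row and column terms are plain means of the residuals over outlets and over products respectively, which is a reading of the slide rather than something it states unambiguously; the input ratios are hypothetical.

```python
def cell_residuals(f0, f1):
    """Residuals e_ij = l_ij * (f1_ij - I_hat * f0_ij) for one cell (g, h).

    f0[i][j], f1[i][j]: price ratios for product i in outlet j,
    or None where the product is not traded (l_ij = 0).
    Returns (I_hat, e, row_means, col_means).
    """
    m, n = len(f0), len(f0[0])
    traded = [[f0[i][j] is not None for j in range(n)] for i in range(m)]
    y_hat = sum(f1[i][j] for i in range(m) for j in range(n) if traded[i][j])
    x_hat = sum(f0[i][j] for i in range(m) for j in range(n) if traded[i][j])
    i_hat = y_hat / x_hat  # estimated cell index
    e = [[f1[i][j] - i_hat * f0[i][j] if traded[i][j] else 0.0
          for j in range(n)] for i in range(m)]
    # Assumption: row terms average over the n outlets, column terms over the m products.
    row_means = [sum(e[i]) / n for i in range(m)]
    col_means = [sum(e[i][j] for i in range(m)) / m for j in range(n)]
    return i_hat, e, row_means, col_means


# Hypothetical ratios for 2 products x 2 outlets (each traded pair sums to 2).
f0 = [[0.95, 1.00], [0.98, None]]
f1 = [[1.05, 1.00], [1.02, None]]
i_hat, e, row_means, col_means = cell_residuals(f0, f1)
total_residual = sum(sum(row) for row in e)
```

Whatever the normalization, the residuals of a cell sum to zero by construction, because the estimated cell index is the ratio of the two sums.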

Case study

one geographical area: Udine county chief town (resident population: 96,750)

one COICOP division (two-digit level): “Food and non alcoholic beverages”

reference period: December 2007

Outlets are divided into 12 strata according to commercial distribution type (reduced to 5 types for Food and non alcoholic beverages)

Representative products are divided into 52 strata according to the national nomenclature (categories of products)

Currently purposive sampling is used for both outlets and products, but probability sampling has been postulated for both

Inclusion probabilities for representative products ($\pi_{gi}$): imputation by brand information in each stratum

Inclusion probabilities for outlets ($\pi_{hj}$): imputation by the number of representative products collected in each outlet

main numerical results

Food and non alcoholic beverages Division

Sample size = 2,373

Î (index) = 103.979569

VPRO = 0.009466

VOUT = 0.000904

VINT = 0.000719

VTOT = 0.011090

95% confidence interval
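The transcript does not report the bounds of the interval. Assuming a normal approximation (a 1.96 multiplier, which the slides do not state), they can be recomputed from the figures above, as in this Python sketch.

```python
import math

index = 103.979569
v_pro, v_out, v_int = 0.009466, 0.000904, 0.000719

# Total variance as the sum of the three reported components.
v_tot = v_pro + v_out + v_int
standard_error = math.sqrt(v_tot)

# Assumption: normal approximation for the 95% interval.
half_width = 1.96 * standard_error
ci_low, ci_high = index - half_width, index + half_width

# Share of the total variance due to the choice of representative products.
share_pro = v_pro / v_tot
```

Under this assumption the interval runs from roughly 103.77 to 104.19, and the product component accounts for about 85% of the total variance. The sum of the three rounded components differs from the reported VTOT only in the last digit.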

the cost function

one data collection method: interviewers collect prices each month by visiting each outlet

Thus the following cost function is used

the approach to cost function estimation

$$C = C_0 + \sum_h n_h \left( a_h + b_h \sum_g m_g r_{gh} \right)$$

where

$C_0$ = fixed cost (e.g. for administration and other)

$n_h$ = the number of outlets in stratum $h$

$m_g$ = the number of products in stratum $g$

$a_h$ = fixed cost per outlet in stratum $h$ (e.g. for travel time)

$b_h$ = cost of measuring one product in the outlets of stratum $h$

$r_{gh}$ = average relative frequency of products in stratum $g$ sold in outlets of stratum $h$
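A minimal Python sketch of the cost function follows. It reads the formula as a fixed cost plus, for each outlet, a visit cost $a_h$ and a measuring cost $b_h$ times the expected number of products found there, $\sum_g m_g r_{gh}$. All numerical values are hypothetical.

```python
def total_cost(c0, n, a, b, m, r):
    """C = C0 + sum_h n_h * (a_h + b_h * sum_g m_g * r_gh).

    n[h], a[h], b[h]: outlets, per-outlet cost and per-product cost in outlet stratum h.
    m[g]: number of products in product stratum g.
    r[g][h]: average relative frequency of products of stratum g in outlets of stratum h.
    """
    cost = c0
    for h in range(len(n)):
        # Expected number of products priced in one outlet of stratum h.
        products_per_outlet = sum(m[g] * r[g][h] for g in range(len(m)))
        cost += n[h] * (a[h] + b[h] * products_per_outlet)
    return cost


# Hypothetical example: 2 outlet strata, 2 product strata, costs in hours.
cost = total_cost(
    c0=0.0,
    n=[10, 5],
    a=[0.5, 1.0],
    b=[0.05, 0.1],
    m=[20, 10],
    r=[[1.0, 0.5], [0.5, 1.0]],
)
```

With these values each outlet of the first stratum costs 1.75 hours and each outlet of the second costs 3 hours, so the total is 32.5 hours.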

the allocation problem

Case study

County chief town: Udine

Resident population: 96,750

Reference time: December 2007

Food and non alcoholic beverages price quotes: 2,373

Food and non alcoholic beverages outlets: 43

$C_0$ = not considered

$a_h$ = the average travel time (in hours)

$b_h$ = the average collecting time (in hours)

Estimated $C_{TOT}$ = 182 hours
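To give an idea of what the allocation problem looks like, the Python sketch below runs a brute-force grid search over integer sample sizes for a single product stratum and a single outlet stratum. It assumes a variance of the form $A/m + B/n + D/(mn)$, which is a simplification introduced here for illustration and not a formula taken from the slides, together with the cost function with $C_0$ ignored. Every numerical value is hypothetical except the 182-hour budget.

```python
def best_allocation(var_pro, var_out, var_int, a, b, r, budget,
                    max_m=200, max_n=200):
    """Grid search for (m, n) minimizing an assumed variance under a cost budget.

    Assumed variance: var_pro / m + var_out / n + var_int / (m * n).
    Cost: n * (a + b * m * r), i.e. the cost function with C0 ignored.
    Returns (variance, m, n, cost), or None if nothing is feasible.
    """
    best = None
    for m in range(2, max_m + 1):
        for n in range(2, max_n + 1):
            cost = n * (a + b * m * r)
            if cost > budget:
                break  # cost only grows with n for a fixed m
            variance = var_pro / m + var_out / n + var_int / (m * n)
            if best is None or variance < best[0]:
                best = (variance, m, n, cost)
    return best


# Hypothetical unit variances and unit costs (hours); the budget is the case-study total.
result = best_allocation(var_pro=0.5, var_out=0.05, var_int=0.04,
                         a=1.0, b=0.05, r=0.8, budget=182.0)
variance, m_opt, n_opt, cost_opt = result
```

A grid search is used only because it is easy to verify; the nonlinear optimization over all strata would need a proper solver.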

Conclusion

• Developing the contents of the paper by solving the nonlinear optimization problem deriving from the cost and variance formulas

• Important news: preliminary attempt to estimate the variance of the Italian CPI

• Enhancing the effort to move towards a probability approach to CPI sampling

2. indicators to monitor data collection process

Data collection: the net design

[Figure: network diagram. Labels shown: Istat CPI Office, data server, DB Oracle, e-mail server, FTP server, web server, firewall, intranet, and data collectors connected via PSTN or UMTS.]

Different steps of data check and data quality indicators

1. Data collection software

2. UMTS data transmission for each outlet or data collection tour: first check and first set of data quality indicators on the web server (possibly real-time data in the outlet)

3. Second check on the total amount of monthly elementary data and second set of data quality indicators (MOS)

4. Final check (the third one) on the total amount of elementary data coming from all the chief towns (Istat) and third set of data quality indicators

5. Quarterly check concerning sampling

A completely integrated data production process, where each event flagged by the system of indicators triggers actions to remove mistakes or their possible causes

Thank you for your attention

Federico Polidoro (Istat - Italy, polidoro@Istat.it)

Rosabel Ricci (Istat - Italy, roricci@Istat.it)

Anna Maria Sgamba (Istat - Italy, sgamba@Istat.it)
