Quality in Italian consumer price survey:
optimal allocation of resources and indicators to monitor the data collection process
Federico Polidoro, Rosabel Ricci, Anna Maria Sgamba
( Istat - Italy )
introduction
quality in Consumer Price Survey
two research topics
1. the optimal allocation of the available resources (minimizing sampling error + burden and cost)
2. the definition of a system of indicators to monitor the data collection process (minimizing non-sampling error)
the optimal allocation of the available resources
the issue: the calculation of a consumer price index (CPI) requires a large amount of resources
the aim: allocating these resources in the most efficient way (quality: burden and cost)
indicators to monitor data collection process
the issue: the definition of a system of indicators to monitor the data collection process
the aim: improving data quality (quality: accuracy)
1. the optimal allocation of the
available resources
Approach description
Italian background
Approach to variance estimation
Cost function
Case study and results
the objective of this research
identifying the optimal sample sizes, in terms of either outlets or elementary items observed, in order to minimize the sampling error as measured by the sampling variance
the optimal allocation approach
2 pillars: a variance function and a cost function
in order to derive optimal sample sizes minimizing the variance of the estimates for a given cost
Italian background
consumer price index sampling structure
Sampling of geographical areas
Sampling of outlets
Sampling of products
Sampling of elementary items in each outlet
consumer price index sampling design
non-probability sampling
consumer price index sampling methods
Sampling of geographical areas
the selection of geographical areas is established by Italian laws
(No 222/1927 and 621/1975)
in 2007 prices were collected in 85 county chief towns (Municipal Offices of
Statistics, MOS) all over the national territory
Sampling of outlets
within each county chief town, the selection of outlets is carried out by the MOS
the sample is drawn from the outlet list of the Chamber of Commerce, the statistical business register (ASIA), census data and other local sources
the outlets with the highest total sales are chosen (a mix of cut-off and quota sampling)
in 2007 prices were collected in about 40,000 outlets all over the national territory
Sampling of products
the selection of products is carried out by the National Institute of Statistics (Istat)
the selection of the products, a list (basket) of product types with product type specifications, is based on sales data (cut-off sampling)
in 2007, 540 products were included in the CPI basket
Sampling of elementary items in each outlet
within each outlet, the selection of elementary items is carried out by the MOS's price collector
the best-selling elementary item is chosen (the representative item method)
in 2007 about 400,000 price quotations were collected all over the national territory
sample update
yearly base revision
optimum sample allocation
current sizes of samples for elementary items are not optimal
the approach to variance estimation
the Swedish approach has been used to estimate the variance of the CPI (Dalén, Ohlsson, 1995)
the sample is considered as drawn from a two-dimensional population of products and outlets: a cross-classified sample (CCS)
representative products – as rows (i)
outlets – as columns (j)
stratification into categories of products – stratum (g)
stratification into outlet groups – stratum (h)
the crossing of strata - cell (g,h)
the parameter (index) = I
parameter estimator (index) = Î
the general index (target parameter)
$$I = \sum_g \sum_h I_{gh} V_{gh}$$
where
$I_{gh}$ = the index of cell (g,h)
$V_{gh}$ = turnover weight of cell (g,h), i.e. of the category of products g traded in the outlets of group h
the cell index
$$I_{gh} = \frac{\sum_i \sum_j l_{ij} w_i w_j f_{ij}^1}{\sum_i \sum_j l_{ij} w_i w_j f_{ij}^0} = \frac{Y_{gh}}{X_{gh}}$$
where
$w_i$ = weight for representative product i
$w_j$ = weight for outlet j
$l_{ij}$ = 1 if representative product i is traded in outlet j, 0 otherwise
$f_{ij}^1 = p_{ij}^1 / \left( (p_{ij}^0 + p_{ij}^1) / 2 \right)$
$f_{ij}^0 = p_{ij}^0 / \left( (p_{ij}^0 + p_{ij}^1) / 2 \right)$
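As a rough illustration of the cell index defined above, the following Python sketch computes Y_gh / X_gh for one cell from two small price tables. The function name, the toy prices and weights, and the reading of p0 and p1 as base-period and current-period prices are assumptions made for this example; they are not taken from the slides.

```python
def cell_index(p0, p1, w_prod, w_out):
    """Cell index I_gh = Y_gh / X_gh for a single cell (g, h).

    p0[i][j], p1[i][j]: prices of product i in outlet j, or None when the
    product is not traded there (l_ij = 0). Assumed here: p0 is the base
    period price and p1 the current period price.
    """
    y = 0.0
    x = 0.0
    for i, w_i in enumerate(w_prod):
        for j, w_j in enumerate(w_out):
            if p0[i][j] is None or p1[i][j] is None:
                continue  # l_ij = 0: the pair contributes nothing
            mean_price = (p0[i][j] + p1[i][j]) / 2
            y += w_i * w_j * p1[i][j] / mean_price  # l_ij * w_i * w_j * f_ij^1
            x += w_i * w_j * p0[i][j] / mean_price  # l_ij * w_i * w_j * f_ij^0
    return y / x


# Hypothetical toy data: 2 representative products, 2 outlets.
p0 = [[1.00, 1.10], [2.00, None]]
p1 = [[1.05, 1.10], [2.20, None]]
index = cell_index(p0, p1, w_prod=[0.6, 0.4], w_out=[0.5, 0.5])
print(index)
```

When prices do not change between the two periods, numerator and denominator coincide and the cell index equals 1.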
the estimated general index
$$\hat{I} = \sum_g \sum_h \hat{I}_{gh} V_{gh}$$
where the estimated cell index is
$$\hat{I}_{gh} = \frac{\sum_i \sum_j l_{ij} f_{ij}^1}{\sum_i \sum_j l_{ij} f_{ij}^0} = \frac{\hat{Y}_{gh}}{\hat{X}_{gh}}$$
under the CCS assumption the variance estimator can be decomposed as
V(Î)tot ≈ VPRO + VOUT + VINT
where
VPRO = variance between representative products
VOUT = variance between outlets
VINT = outlet and representative product interaction variance
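formulas for variance estimation
$$\hat{V}_{OUT} = \sum_h \frac{1}{n_h (n_h - 1)} \sum_j (1 - \pi_{hj}) \left( \sum_g \frac{V_{gh}}{\hat{X}_{gh}} \hat{e}^{gh}_{\cdot j} \right)^2$$
$$\hat{V}_{PRO} = \sum_g \frac{1}{m_g (m_g - 1)} \sum_i (1 - \pi_{gi}) \left( \sum_h \frac{V_{gh}}{\hat{X}_{gh}} \hat{e}^{gh}_{i \cdot} \right)^2$$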
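with the following formula for the interaction variance
$$\hat{V}_{INT} = \sum_g \sum_h \frac{1}{m_g (m_g - 1)} \frac{1}{n_h (n_h - 1)} \frac{V_{gh}^2}{\hat{X}_{gh}^2} \sum_i \sum_j (1 - \pi_{gi}) (1 - \pi_{hj}) \left( \hat{e}^{gh}_{ij} - \hat{e}^{gh}_{i \cdot} - \hat{e}^{gh}_{\cdot j} \right)^2$$
where
$$\hat{e}^{gh}_{ij} = l_{ij} \left( f_{ij}^1 - \hat{I}_{gh} f_{ij}^0 \right)$$
$$\hat{e}^{gh}_{i \cdot} = \frac{1}{n_h} \sum_j \hat{e}^{gh}_{ij}$$
$$\hat{e}^{gh}_{\cdot j} = \frac{1}{m_g} \sum_i \hat{e}^{gh}_{ij}$$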
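To make the three variance components more concrete, here is a minimal Python sketch restricted to a single cell (one product stratum g and one outlet stratum h). It assumes that every product is traded in every outlet (l_ij = 1), that the product residual is the mean of the residuals over outlets and the outlet residual the mean over products, and that the cell weight V_gh is 1 by default. The function name and all numbers are hypothetical; this is an illustration, not the computation behind the case study.

```python
def variance_components(f0, f1, pi_prod, pi_out, v_gh=1.0):
    """Single-cell sketch of the V_PRO, V_OUT and V_INT estimators.

    f0[i][j], f1[i][j]: price relatives of product i in outlet j.
    pi_prod[i], pi_out[j]: inclusion probabilities (hypothetical values).
    Requires at least 2 products and 2 outlets.
    """
    m, n = len(f0), len(f0[0])
    y_hat = sum(sum(row) for row in f1)
    x_hat = sum(sum(row) for row in f0)
    i_hat = y_hat / x_hat  # estimated cell index

    # Residuals e_ij = f1_ij - I_hat * f0_ij, and their means by row and column
    e = [[f1[i][j] - i_hat * f0[i][j] for j in range(n)] for i in range(m)]
    e_prod = [sum(e[i]) / n for i in range(m)]                      # e_i.
    e_out = [sum(e[i][j] for i in range(m)) / m for j in range(n)]  # e_.j

    scale = v_gh / x_hat
    v_pro = sum((1 - pi_prod[i]) * (scale * e_prod[i]) ** 2
                for i in range(m)) / (m * (m - 1))
    v_out = sum((1 - pi_out[j]) * (scale * e_out[j]) ** 2
                for j in range(n)) / (n * (n - 1))
    v_int = scale ** 2 / (m * (m - 1) * n * (n - 1)) * sum(
        (1 - pi_prod[i]) * (1 - pi_out[j])
        * (e[i][j] - e_prod[i] - e_out[j]) ** 2
        for i in range(m) for j in range(n))
    return i_hat, v_pro, v_out, v_int


# Hypothetical toy data: 3 representative products, 2 outlets.
f0 = [[0.98, 1.00], [0.95, 0.97], [1.00, 0.99]]
f1 = [[1.02, 1.00], [1.05, 1.03], [1.00, 1.01]]
pi_prod = [0.5, 0.3, 0.2]
pi_out = [0.4, 0.6]
i_hat, v_pro, v_out, v_int = variance_components(f0, f1, pi_prod, pi_out)
print(i_hat, v_pro + v_out + v_int)
```

With several strata, the sums over g and h in the formulas would wrap this single-cell computation.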
Case study
one geographical area: Udine county chief town (resident population: 96,750)
one COICOP division (two-digit level): “Food and non-alcoholic beverages”
reference period: December 2007
outlets are divided into 12 strata according to commercial distribution type (reduced to 5 types for Food and non-alcoholic beverages)
representative products are divided into 52 strata according to the national nomenclature (categories of products)
purposive sampling is currently used for both outlets and products, but probability sampling has been postulated for both
inclusion probabilities for representative products (πgi): imputed from brand information in each stratum
inclusion probabilities for outlets (πhj): imputed from the number of representative products collected in each outlet
main numerical results
Food and non-alcoholic beverages division
Sample size = 2,373
Î (index) = 103.979569
VPRO = 0.009466
VOUT = 0.000904
VINT = 0.000719
VTOT = 0.011090
95% confidence interval
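The slide does not preserve the interval itself. As a hedged illustration, the sketch below derives an approximate 95% confidence interval from the figures listed above, assuming a normal approximation (z = 1.96); that assumption and the variable names are not from the source.

```python
import math

# Figures reported in the numerical results
i_hat = 103.979569
v_pro, v_out, v_int = 0.009466, 0.000904, 0.000719
v_tot = 0.011090

# The three components add up to the reported total, up to rounding
components_sum = v_pro + v_out + v_int

# Assumption: normal approximation, so the half-width is 1.96 standard errors
z = 1.96
half_width = z * math.sqrt(v_tot)
lower = i_hat - half_width
upper = i_hat + half_width
print(f"{lower:.3f} to {upper:.3f}")  # about 103.773 to 104.186
```

The variance between representative products accounts for most of the total, which is why the half-width of about 0.21 index points is driven mainly by VPRO.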
the cost function
one data collection method: interviewers collect prices each month by visiting each outlet
thus the following cost function is used
the approach to cost function estimation
$$C = C_0 + \sum_h n_h \left( a_h + b_h \sum_g m_g r_{gh} \right)$$
where
C0 = fixed cost (e.g. for administration)
nh = the number of outlets in stratum h
mg = the number of products in stratum g
ah = fixed cost per outlet in stratum h (e.g. for travel time)
bh = cost of measuring one product in the outlets of stratum h
rgh = average relative frequency of products in stratum g sold in outlets of stratum h
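The cost function above can be sketched in a few lines of Python, reading it as a fixed cost plus, for each outlet stratum, the number of outlets times the sum of the per-outlet cost and the per-product cost of the products expected in that outlet. The function name and the numbers (two outlet strata, two product strata, times in hours) are hypothetical and only illustrate the arithmetic.

```python
def total_cost(n, m, a, b, r, c0=0.0):
    """C = C0 + sum_h n_h * (a_h + b_h * sum_g m_g * r_gh).

    n[h]: outlets in stratum h; m[g]: products in stratum g;
    a[h]: fixed cost per outlet; b[h]: cost per product measured;
    r[g][h]: average relative frequency of stratum g products in stratum h outlets.
    """
    cost = c0
    for h, n_h in enumerate(n):
        # Expected number of products priced in one outlet of stratum h
        products_per_outlet = sum(m_g * r[g][h] for g, m_g in enumerate(m))
        cost += n_h * (a[h] + b[h] * products_per_outlet)
    return cost


# Hypothetical inputs, costs expressed in hours
n = [10, 5]
m = [20, 30]
a = [0.5, 0.25]
b = [0.02, 0.03]
r = [[0.8, 0.5], [0.6, 0.9]]
cost = total_cost(n, m, a, b, r)
print(cost)  # 11.8 hours for the first stratum plus 6.8 for the second
```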
the allocation problem
Case study
County chief town: Udine
Resident population: 96,750
Reference time: December 2007
Food and non-alcoholic beverages price quotes: 2,373
Food and non-alcoholic beverages outlets: 43
C0 = not considered
ah = average travel time (hours)
bh = average collecting time (hours)
Estimated CTOT = 182 hours
Conclusion
• Next step: developing the paper further by solving the nonlinear optimization problem arising from the cost and variance functions
• Main novelty: a first attempt to estimate the variance of the Italian CPI
• Strengthening the effort to move towards a probability approach to CPI sampling
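The optimization step itself is left as future work in the slides. Purely as an illustration of what minimizing variance for a given cost can look like, the sketch below does a brute-force search over the number of products m and outlets n in a single cell. The variance model s_pro/m + s_out/n + s_int/(m*n) is an assumption introduced here, as are the function name and every number; none of them come from the case study.

```python
def best_allocation(s_pro, s_out, s_int, a, b, r, budget, max_m=200, max_n=200):
    """Grid search for the (m, n) pair with the smallest variance within budget.

    Assumed variance model: s_pro / m + s_out / n + s_int / (m * n).
    Cost of one cell: n * (a + b * m * r), with no fixed cost.
    Returns (variance, m, n, cost), or None if nothing is affordable.
    """
    best = None
    for m in range(2, max_m + 1):
        for n in range(2, max_n + 1):
            cost = n * (a + b * m * r)
            if cost > budget:
                break  # cost only grows with n, so stop this row
            variance = s_pro / m + s_out / n + s_int / (m * n)
            if best is None or variance < best[0]:
                best = (variance, m, n, cost)
    return best


# Hypothetical inputs: variance parameters, costs in hours, 50-hour budget
result = best_allocation(s_pro=2.0, s_out=0.5, s_int=1.0,
                         a=0.5, b=0.05, r=0.7, budget=50.0)
print(result)
```

A grid search is only practical for a handful of strata; with the 52 product strata and 5 outlet types of the case study, a nonlinear programming solver would be the more natural tool.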
2. indicators to monitor data
collection process
Data collection: the net design
[Diagram: data collectors (PSTN or UMTS), firewall, and the Istat CPI Office intranet with data server (DB Oracle), e-mail server, FTP server and web server]
Different steps of data check and data quality indicators
1. Data collection software
2. UMTS data transmission for each outlet or data collection tour: first check and first set of data quality indicators on the web server (real-time data from the outlet possible)
3. Second check on the total amount of monthly elementary data and second set of data quality indicators (MOS)
4. Final check (the third one) on the total amount of elementary data coming from all the chief towns (Istat) and third set of data quality indicators
5. Quarterly check concerning sampling
A completely integrated data production process, in which each event flagged by the system of indicators triggers action to remove mistakes or their possible causes
Thank you for your attention
Federico Polidoro (Istat - Italy)
Rosabel Ricci (Istat - Italy)
Anna Maria Sgamba (Istat - Italy)