
Nairobi, 1-2 October 2007

Some Approaches to Agricultural Statistics

[email protected]


Main approaches to agricultural statistics (1)

Expert subjective estimations
• In each administrative unit, a local expert fills a form with his assessment
Census
• Farm census
• General population census (households)
List frame surveys
• Statistical sampling
  – Small administrative units can be used as first sampling stage
• Selected farms (“purposive sampling”)


Main approaches to agricultural statistics (2)

Area frame sampling
• Observations on the ground
  – Crop area
  – Yield
    • Expert eye estimations
    • Objective measurements
• Remote sensed observations
  – Photo-interpretation
  – Classified images
  – Vegetation or yield indicators

Agro-meteorological models


Expert subjective estimates

Advantages
• Cheap if there is a network of agricultural experts
• No sampling error
• Easy to manage
• All items can be addressed
  – Crop area and yield
  – Livestock
  – Means of production
Disadvantages
• No idea of the accuracy of data
• Difficult to control the quality (non-sampling error), unless a sampling survey is made
• Changes are often underestimated
Generally not recommended, but in some situations it can be the only alternative.
Can be used as covariables to improve the accuracy of a sample survey
• Regression estimator or similar


Farm census

Advantages
• No sampling error
• Detailed information on the farm structure
Disadvantages
• Expensive: in general it can be made only every 10 years.
  – Heavy to manage in many countries.
  – Items that change every year are not included (e.g. crop area and yield)
• Only farms above a size threshold are included: bias (sometimes >10% of systematic underestimate for a recent census).
• Possible additional bias if farmers think that data can be used for tax purposes.
Can be used as list frame for sample surveys
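The size-threshold bias can be illustrated with a small sketch. The holdings below are hypothetical numbers chosen only for illustration, and `census_undercoverage` is a made-up helper name, not part of any statistical package.

```python
def census_undercoverage(farm_areas_ha, threshold_ha):
    """Relative underestimate of total area when farms below the threshold are skipped."""
    true_total = sum(farm_areas_ha)
    census_total = sum(a for a in farm_areas_ha if a >= threshold_ha)
    return (true_total - census_total) / true_total


# Hypothetical holdings (ha): many small farms and a few large ones.
farm_areas_ha = [0.5] * 40 + [2.0] * 30 + [10.0] * 10 + [50.0] * 2

# A census that only lists farms of at least 1 ha misses all the 0.5 ha holdings.
bias = census_undercoverage(farm_areas_ha, threshold_ha=1.0)
print(f"Relative underestimate of total area: {bias:.1%}")
```

The underestimate does not shrink however carefully the listed farms are enumerated, which is why it counts as bias rather than sampling error.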


Population census (agricultural holdings)

Subset of the population census: holdings with some agricultural activity.
Compared to farm census:
Advantages
• Agricultural and non-agricultural activity (income) can be analysed together
• Smaller bias (part-time farming included)
Disadvantages
• Farm structure is more difficult to analyse
• More problematic to use as list frame for sampling surveys


List frame surveys from census

A statistical sample is selected in the census (farms or households)
Advantages
• Flexible: general or specialized surveys possible.
• Stratification can be very efficient if farms have heterogeneous sizes or they are specialised in different productions.
• Often smaller bias than census (quality control is easier)
Disadvantages
• Bias can be important if:
  – Census is incomplete or not updated
  – Answers of farmers are not fully reliable
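As a minimal sketch of how a stratified list-frame sample is expanded to a total, the function below applies the textbook estimator for simple random sampling within strata. The strata sizes and sample values are invented for illustration, and `stratified_total` is a hypothetical helper name.

```python
import math
import statistics


def stratified_total(strata):
    """strata: list of (N_h, sample_values); returns (estimated total, standard error)."""
    total = 0.0
    variance = 0.0
    for N_h, sample in strata:
        n_h = len(sample)
        mean_h = statistics.mean(sample)
        s2_h = statistics.variance(sample)  # sample variance, needs n_h >= 2
        total += N_h * mean_h
        # Variance of the stratum total, with finite population correction.
        variance += N_h**2 * (1 - n_h / N_h) * s2_h / n_h
    return total, math.sqrt(variance)


# Hypothetical frame: 1000 small farms and 100 large farms, crop area in ha.
strata = [
    (1000, [1.0, 2.0, 3.0, 2.0]),
    (100, [40.0, 60.0, 50.0, 50.0]),
]
est_total, se_total = stratified_total(strata)
print(est_total, se_total)
```

Separating small and large farms keeps the within-stratum variances low, which is where the efficiency gain from stratification comes from. The standard error covers sampling error only; an incomplete or outdated census still biases the result.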


Two-stage list frame if census is unavailable

A statistical sample of (small) administrative units is selected
A “mini-census” is made in each of the selected administrative units
A statistical sample is selected in each of the mini-censuses
Advantages compared to list frame survey on a census
• Easier to update the sampling frame (smaller bias)
Disadvantages
• Less flexible than sampling on a proper census
• Less efficient stratification
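The three steps above can be sketched as follows, assuming simple random sampling without replacement at both stages. The function name, the data layout and the example figures are assumptions made for illustration; in a real survey the farm list of a unit only exists after its mini-census.

```python
import random


def two_stage_total(units, m, n_per_unit, rng):
    """units: dict unit_id -> list of farm values; returns an estimated population total."""
    M = len(units)
    selected = rng.sample(sorted(units), m)  # stage 1: sample administrative units
    estimate = 0.0
    for unit_id in selected:
        farms = units[unit_id]  # the "mini-census" list of this unit
        if not farms:
            continue  # a unit without farms contributes zero
        n_i = min(n_per_unit, len(farms))
        sample = rng.sample(farms, n_i)  # stage 2: sample farms within the unit
        estimate += len(farms) / n_i * sum(sample)  # expand to the unit total
    return M / m * estimate  # expand to the population total


# Hypothetical population: three administrative units with farm crop areas (ha).
units = {"A": [1.0, 2.0, 3.0], "B": [4.0, 5.0], "C": [6.0]}
print(two_stage_total(units, m=2, n_per_unit=2, rng=random.Random(0)))
```

When every unit and every farm is selected, the expansion factors equal 1 and the function returns the true total, which is a quick sanity check of the weights.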


Purposive sampling of farms

A set of farms is selected without a proper statistical method.
Advantages
• Provides an emergency solution if a proper statistical method is not applicable
  – Crisis situations
• Avoids high rates of non-response
  – Data difficult to provide or sensitive (accountancy)
Disadvantages
• Requires a good knowledge of covariables in the population for extrapolation
• Very high risk of bias
May produce acceptable results on the inter-annual change rates.


Area Frame Sampling

The sampling frame is not a list of farms, but the geographic space divided into sampling units:

• Segments: portions of territory, generally 9 ha – 400 ha.
  – Physical boundaries: segments are delimited by roads, rivers or field limits
  – Geometric shape: squares…
• Points: in practice a “point” is conceived as a piece of land (3 x 3 m)
• Transects: straight lines of a given length
  – Often used for environmental observations.
Observation mode:
• Direct on the ground: crop, yield estimation…
• Interview with the farmers who manage the selected fields
Sampling techniques:
• Random or systematic
• Clustered or unclustered
• Stratified or non-stratified
• …
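For the simplest design listed above (unclustered, non-stratified random points), crop area can be estimated from the share of points that fall on the crop. The sketch below assumes simple random sampling of points. The function name and the figures are hypothetical.

```python
import math


def crop_area_from_points(n_points, n_crop_points, region_area_ha):
    """Estimate crop area (ha) and its standard error from a random sample of points."""
    p = n_crop_points / n_points  # share of points observed on the crop
    se_p = math.sqrt(p * (1 - p) / (n_points - 1))  # standard error of the share
    return p * region_area_ha, se_p * region_area_ha


# Hypothetical survey: 1000 points in a 50 000 ha region, 200 of them on wheat.
area_ha, se_ha = crop_area_from_points(1000, 200, 50_000)
print(f"{area_ha:.0f} ha +/- {se_ha:.0f} ha")
```

Systematic, clustered or stratified designs need their own variance formulas, so this standard error should be read as a rough guide outside the simple random case.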


Area Frame Sampling (2)

Advantages
• The sampling frame coincides quite precisely with the population
  – No (few) missing elements in the frame
  – No repeated elements
• Sampling units easy to define
  – Except for segments with physical boundaries
• Objective (if direct observations)
• Can be combined with remote sensing for further improvement
Drawbacks
• For direct observation, the date of the field visit can be critical. Problems appear if
  – Crop not yet emerged
  – Already harvested and insufficient traces left
  – Not clear if it will be harvested
• Locating the units (segments or points) requires reliable field survey material
  – Aerial photographs with a proper enlargement
  – GPS
    • Daily access to a reliable power supply
    • Technical ability to operate the device
    • Sometimes limitations due to security-military concerns


Yield observations on the ground (1)

Expert eye estimates on a statistical sample
• Possibly with the help of a table: number of ears per m2, number of grains per ear, size of the grains…
Advantages
• Cheap
• Possibility of providing geo-referenced data to combine with satellite images
• Good results if the experts are reliable
Drawbacks
• Difficult to assess accuracy.
• Possible strong bias (there can be an interest in higher or lower estimates, e.g. aid).
Quality control is needed
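A table of yield components like the one mentioned above can be turned into a yield figure by simple multiplication. The sketch assumes that grain size is expressed as thousand-grain weight in grams, which the slide does not state. The function name and the example values are hypothetical.

```python
def yield_from_components(ears_per_m2, grains_per_ear, thousand_grain_weight_g):
    """Approximate grain yield in t/ha from visually assessed yield components."""
    grams_per_m2 = ears_per_m2 * grains_per_ear * thousand_grain_weight_g / 1000.0
    return grams_per_m2 / 100.0  # 1 g/m2 = 10 kg/ha = 0.01 t/ha


# Hypothetical cereal field: 400 ears/m2, 30 grains per ear, 40 g per 1000 grains.
print(yield_from_components(400, 30, 40))
```

An error in any one component carries through proportionally to the result, so the figure is only as reliable as the expert's counts.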


Yield observations on the ground (2)

Objective measurements.
• The crop is cut in a square of a given size (e.g. 1 m2)
• Precise weighing in laboratory.
Advantages
• Statistical error can be computed
• Possibility of providing geo-referenced data to combine with satellite images
• Good results if the enumerators are meticulous
Drawbacks
• Difficult to be precise in the application of the rules of sample collection.
  – Enumerators tend to avoid parts of the field with lower yield
• Possible bias (over-estimation), even applying coefficients for harvest loss.
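The statistical error of crop-cut yields can be computed roughly as below, treating the cuts as a simple random sample. The weights, the `harvest_loss` coefficient and the function name are assumptions made for illustration.

```python
import math
import statistics


def crop_cut_yield(weights_g, plot_area_m2=1.0, harvest_loss=0.0):
    """Mean yield (t/ha) and its standard error from weighed crop cuts."""
    per_m2 = [w / plot_area_m2 for w in weights_g]
    factor = (1.0 - harvest_loss) / 100.0  # g/m2 -> t/ha, net of assumed losses
    mean_t_ha = statistics.mean(per_m2) * factor
    se_t_ha = statistics.stdev(per_m2) / math.sqrt(len(per_m2)) * factor
    return mean_t_ha, se_t_ha


# Hypothetical laboratory weights (g) of five 1 m2 cuts.
mean_yield, se_yield = crop_cut_yield([300, 340, 320, 360, 280])
print(mean_yield, se_yield)
```

The standard error reflects only the randomness of the sample. If enumerators avoid the poorer parts of the field, the resulting over-estimation is not captured by it.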


Remote sensed observations (satellite images)

Can provide information on area or yield
• As covariable: combined with a consistent ground survey
For area estimation
• Should not be used to substitute ground survey, except in particular cases:
  – Conflict (dangerous to go to the field)
  – No authorization (illegal crops, North Korea,…)
  – Very high accuracy in the identification of crops (>95%).
    • E.g. large fields of rice
For yield estimation
• Vegetation indexes in arid regions give good indications on inter-annual change
• Co-variable to be combined with geo-referenced measurements


Remote sensing: cost-efficiency assessment

The accuracy of a (normal) ground survey + remote sensing = accuracy of a more intense ground survey.

• Which option is cheaper?
• Elements needed to assess:
  – Cost structure of the ground survey
    • Fixed cost
    • Cost per additional Primary Sampling Unit (Administrative unit?) in the sample
    • Cost per additional Secondary Sampling Units (farms, segments, points, fields)
  – Cost of remote sensing
    • Images
    • Image processing
    • Combining remote sensing with ground data
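One way to frame the comparison is sketched below. It relies on the standard large-sample approximation that a regression estimator with correlation r between ground data and the remote sensing covariable has the variance of a ground survey about 1 / (1 - r²) times larger. As a simplification, the cost structure is reduced to a single cost per additional sampling unit. All names and figures are hypothetical.

```python
def equivalent_ground_sample(n_ground, correlation):
    """Ground sample size with roughly the same variance as n_ground units plus remote sensing."""
    return n_ground / (1.0 - correlation**2)


def remote_sensing_is_cheaper(n_ground, correlation, cost_per_unit, remote_sensing_cost):
    """Compare the remote sensing cost with the cost of the extra ground units it replaces."""
    extra_units = equivalent_ground_sample(n_ground, correlation) - n_ground
    return remote_sensing_cost < extra_units * cost_per_unit


# Hypothetical case: 300 segments, r = 0.8, 50 per extra segment, 20 000 for
# images, processing and combination with ground data (same currency).
print(equivalent_ground_sample(300, 0.8))
print(remote_sensing_is_cheaper(300, 0.8, 50.0, 20_000.0))
```

A fuller assessment would separate fixed costs and the costs per primary and secondary sampling unit, as the list above indicates.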


Agro-meteorological models

Require relatively complex information
• Soil map (water capacity)
• Phenological calendar (planting, flowering..)
• Coefficients describing the physiology of the plant.
• “Clean” meteorological data in nearly-real-time (10 days..)
• But simplified versions are also possible.
For yield estimates, results need to be combined with historical statistical data
• Historical results of the agro-met model needed (long process)
Inter-annual yield change indicators can be good without historical statistical data
• Geographical analysis of areas of concern
• Possibility to combine with coarse resolution satellite images (vegetation indexes…)


Some vocabulary

Bias and standard error
Bias ~ non-sampling error
• Systematic tendency
• No reduction with a higher amount of data
  – Cannot be removed with an exhaustive census, classified image of the whole territory, etc.
• Usually difficult to compute
  – In general no formula available
Standard error ~ sampling error
• Due to the randomness of the sample
• Decreases when the sample grows
• In general formulas are available
  – Sometimes very complicated: simulation possible (bootstrap)
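A minimal sketch of the bootstrap idea follows, assuming a simple random sample. Complex designs (strata, clusters) need resampling schemes that respect the design. The data are simulated and the function name is hypothetical.

```python
import random
import statistics


def bootstrap_se(sample, statistic=statistics.mean, n_replicates=2000, seed=0):
    """Standard error of a statistic, estimated by resampling with replacement."""
    rng = random.Random(seed)
    n = len(sample)
    replicates = [
        statistic(rng.choices(sample, k=n))  # one resample of the same size
        for _ in range(n_replicates)
    ]
    return statistics.stdev(replicates)


# Simulated yields (t/ha) for 30 sampled fields.
data_rng = random.Random(1)
sample = [data_rng.gauss(3.0, 0.5) for _ in range(30)]

boot = bootstrap_se(sample)
analytic = statistics.stdev(sample) / len(sample) ** 0.5
print(boot, analytic)  # the two values should be close for the mean
```

For the mean, the bootstrap only reproduces what the formula already gives. Its value lies in estimators with no convenient formula. Like any standard error, it says nothing about bias.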


Variables and co-variables

Use of these terms in the context of this presentation (and often in sampling survey techniques)

We have a targeted result (e.g. Crop Area)
Variable (or main variable): usually refers to a magnitude that (nearly) coincides conceptually with the targeted result (direct observation)
• measured on a sample of units
  – farms,
  – households,
  – small administrative units,
  – territorial segments,
  – points,
  – fields
• Measurements nearly unbiased
Co-variable: usually refers to more biased measurements known for the whole population or a very large sample
• Subjective estimates
• Classified images
• Vegetation indexes.


Variables and co-variables (2)

Variables and co-variables can be combined
• Regression estimators and similar (difference, ratio…)
• Calibration estimators
• Small area estimators
If the main variable is (nearly) unbiased, the combined estimator is (nearly) unbiased
• Even if the covariable is biased
• If the variable and the co-variable are well correlated, the combined estimator has a smaller standard error
  – But the gain is limited
  – It is important that the estimator based on the main variable has a decent standard error
  – Good ground or farm survey.
  – Quality control
• When combining a variable (known for a sample) and a covariable (exhaustive knowledge), it is important that the covariable has the same quality in the sample and out of the sample
  – Do not improve the co-variable in the sample if you cannot improve it out of the sample.
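As an illustration of the first family of estimators listed above, the sketch below implements the classical linear regression estimator of a mean under simple random sampling. The variable names and the data are hypothetical: `y_sample` stands for ground observations, `x_sample` for the covariable on the same units, and `x_population_mean` for the covariable mean known for the whole territory.

```python
import statistics


def regression_estimate(y_sample, x_sample, x_population_mean):
    """Regression estimator of the population mean of y, using covariable x."""
    y_bar = statistics.mean(y_sample)
    x_bar = statistics.mean(x_sample)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(x_sample, y_sample))
    s_xx = sum((x - x_bar) ** 2 for x in x_sample)
    slope = s_xy / s_xx  # least-squares slope of y on x in the sample
    # Correct the sample mean of y by how far the sample mean of x is from the known mean.
    return y_bar + slope * (x_population_mean - x_bar)


# Hypothetical segments: crop area from a classified image (x) and from the ground (y), in ha.
x_sample = [1.0, 2.0, 3.0, 4.0, 5.0]
y_sample = [2.1, 3.9, 6.2, 7.8, 10.0]
estimate = regression_estimate(y_sample, x_sample, x_population_mean=3.5)
print(estimate)
```

The covariable only enters through its difference between sample and population, so a systematic bias in the classified image cancels out as long as it is the same inside and outside the sample, which is the condition stated in the last bullet.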