FOUR METHODS OF ESTIMATING PM2.5 ANNUAL AVERAGES




Page 1: FOUR METHODS OF ESTIMATING PM 2.5  ANNUAL AVERAGES


Yan Liu and Amy Nail
Department of Statistics, North Carolina State University
EPA Office of Air Quality Planning and Standards, Emissions, Monitoring, and Analysis Division


Project Objectives

• Estimation of the annual average PM2.5 concentration
• Estimation of the standard errors associated with the annual average estimates
• Estimation of the probability that a site’s annual average exceeds 15 µg/m³
• At 2400 lattice points, for 2000 and 2001
• Comparison of 4 different methodologies:
  1. Quarter-based analysis (Yan)
  2. Annual-based analysis (Yan)
  Daily-based analyses:
  3. Doug Nychka’s method (Bill)
  4. Generalized least squares in SAS Proc Mixed (Amy)


Why are Standard Errors Important?

We may estimate that the annual average for lattice point 329 is 16 µg/m³, which exceeds the standard of 15. But since our estimate has some uncertainty (a standard error), we’d like to take that uncertainty into account in order to determine the probability that lattice point 329 exceeds 15.
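One simple way to turn an estimate and its standard error into an exceedance probability is a normal approximation. The sketch below assumes the estimation error is Gaussian; the standard error of 1.0 is a made-up value for illustration, and `exceedance_probability` is a hypothetical helper name, not something from the analysis itself.

```python
import math


def exceedance_probability(estimate: float, std_error: float,
                           threshold: float = 15.0) -> float:
    """P(true annual average > threshold), assuming a normal estimation error."""
    if std_error <= 0:
        raise ValueError("std_error must be positive")
    z = (threshold - estimate) / std_error
    # Standard normal CDF evaluated at z, written with the error function.
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return 1.0 - cdf


# Lattice point 329 from the slide: estimate 16; the SE of 1.0 is hypothetical.
p_329 = exceedance_probability(16.0, 1.0)
```

With these illustrative numbers the probability is about 0.84, not 1, even though the point estimate is above the standard. The result is only as good as the standard error that goes into it.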


In addition to maps like this ...


... we also want maps like this.

Note: This Map is WRONG--so don’t show it to anyone! We haven’t figured out the correct way to determine errors, so we cannot correctly draw a probability map yet.


Data Description

Concentrations of PM2.5 measured during 2000 and 2001

The domain analyzed: the portion of the U.S. east of –100° longitude

Concentrations measured every third day


Map of 2400 Lattice Points


Method 1 – Quarterly Analysis

• 3 months in each quarter
  – Q1 (Jan. - Mar.), Q2 (Apr. - Jun.)
  – Q3 (Jul. - Sep.), Q4 (Oct. - Dec.)
• Within quarters, 75% completeness
• Found quarter mean conc. at each site
• For each quarter, kriged mean conc. over lattice
• Averaged the quarter predictions to get annual average estimate
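The site-level bookkeeping behind the steps above can be sketched as follows. This is an illustration only: it assumes "75% completeness" means that at least 75% of a quarter's scheduled samples are present, the function names and data are hypothetical, and the kriging step is left out (the four quarterly values are simply averaged at one location).

```python
from statistics import mean
from typing import Optional, Sequence


def quarter_mean(samples: Sequence[Optional[float]],
                 min_complete: float = 0.75) -> Optional[float]:
    """Mean of one site's samples in a quarter, or None if too few are present.

    `samples` has one entry per scheduled sampling day; None marks a missing day.
    """
    present = [x for x in samples if x is not None]
    if not samples or len(present) / len(samples) < min_complete:
        return None
    return mean(present)


def annual_from_quarters(quarter_values: Sequence[Optional[float]]) -> Optional[float]:
    """Average four quarterly values; None if any quarter is unavailable."""
    if len(quarter_values) != 4 or any(q is None for q in quarter_values):
        return None
    return mean(quarter_values)


# Hypothetical data with 4 scheduled days per quarter, for brevity.
q1 = quarter_mean([12.0, 14.0, None, 16.0])  # 3 of 4 present: kept
q2 = quarter_mean([10.0, None, None, 11.0])  # 2 of 4 present: dropped
```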


Annual Average Predictions


Method 2 – Annual Analysis

Used sites common to all 4 quarters in quarterly analysis

Found annual mean conc. at each site

Kriged annual mean conc. over lattice


The Number of Sites

             2000   2001
Quarter 1     510    631
Quarter 2     575    642
Quarter 3     619    682
Quarter 4     613    666
Annual        394    517


Model for Quarterly and Annual Analyses

Predicted value = quadratic surface prediction (SP) + error prediction (KP)


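Estimating Quadratic Surface

Model: $\mathrm{Conc} = \beta_0 + \beta_1\,\mathrm{lat} + \beta_2\,\mathrm{lon} + \beta_3\,\mathrm{lat}^2 + \beta_4\,\mathrm{lon}^2 + \beta_5\,\mathrm{lat}\cdot\mathrm{lon} + \varepsilon$

Assume:
1) $E(\varepsilon) = 0$, $\mathrm{Var}(\varepsilon) = \sigma^2 I$
2) The betas are estimated by SAS assuming the errors are iid

Fit parameters using ordinary least squares in SAS proc reg

Obtained surface predictions (SP), their standard errors (SEsp), and the $\hat{\varepsilon}$’s (residuals)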
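The ordinary least squares fit of the quadratic surface can be sketched in plain Python. This is not the SAS proc reg run used in the analysis; it is a minimal illustration that solves the normal equations directly, and all function names are hypothetical. For real latitude/longitude values the normal equations can be poorly conditioned, so centring the coordinates first would be advisable.

```python
from typing import List, Sequence


def design_row(lat: float, lon: float) -> List[float]:
    """Regressors for the quadratic surface: 1, lat, lon, lat^2, lon^2, lat*lon."""
    return [1.0, lat, lon, lat * lat, lon * lon, lat * lon]


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_quadratic_surface(lats: Sequence[float], lons: Sequence[float],
                          conc: Sequence[float]) -> List[float]:
    """OLS estimates of beta_0..beta_5 from the normal equations (X'X) b = X'y."""
    rows = [design_row(la, lo) for la, lo in zip(lats, lons)]
    p = 6
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, conc)) for i in range(p)]
    return solve(xtx, xty)
```

The residuals that feed the kriging step are then the observed concentrations minus the fitted surface at each site.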


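Kriging the Error Surface

Model: $\{\varepsilon(s) : s \in \mathbb{R}^2\}$

$E(\varepsilon(s)) = 0$

$\mathrm{Var}(\varepsilon(s) - \varepsilon(s')) = 0$ if $s = s'$, and $\sigma_n^2 + \sigma^2 (1 - e^{-\mathrm{dist}/\theta})$ if $s \neq s'$

(with $\sigma_n^2$ the nugget, $\sigma^2$ the partial sill, and $\theta$ the range parameter)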

Estimated variogram parameters using nonlinear least squares in Splus

Obtained kriging predictions (KP) and their standard errors (SEkp)


Variogram Models

3 commonly used variogram models:
– Exponential: $\gamma(h) = 1 - \exp(-3h/a)$
– Spherical: $\gamma(h) = 1.5\,(h/a) - 0.5\,(h/a)^3$ if $h \le a$; $\gamma(h) = 1$ otherwise
– Gaussian: $\gamma(h) = 1 - \exp(-3h^2/a^2)$

a: range
h: distance

[Figure: spherical, exponential, and Gaussian model curves, $\gamma(h)$ plotted against $h$, with the range and sill marked.]
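The three standardized models above (sill of 1, no nugget) translate directly into code. A minimal sketch, with hypothetical function names:

```python
import math


def exponential(h: float, a: float) -> float:
    """Exponential variogram with unit sill and range a."""
    return 1.0 - math.exp(-3.0 * h / a)


def spherical(h: float, a: float) -> float:
    """Spherical variogram: reaches the sill exactly at h = a."""
    if h <= a:
        r = h / a
        return 1.5 * r - 0.5 * r ** 3
    return 1.0


def gaussian(h: float, a: float) -> float:
    """Gaussian variogram with unit sill and range a."""
    return 1.0 - math.exp(-3.0 * h * h / (a * a))
```

With the factor of 3 in the exponent, the exponential and Gaussian models reach about 95% of the sill at `h = a`, while the spherical model reaches the sill exactly.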


Cross Validation to Select Variogram Model

Idea: temporarily remove the sample value at one location at a time, and estimate that value from the remaining data using each of the different variogram models.

Prediction error = observed - predicted

$\mathrm{MSE} = \frac{1}{n-1} \sum (\text{prediction error})^2$
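The leave-one-out loop can be sketched independently of the predictor. To keep the example self-contained, inverse-distance weighting stands in for kriging here; that is a swapped-in technique for illustration only, and in the actual comparison the predictor would be a kriging fit under each variogram model. All names are hypothetical.

```python
import math
from typing import Callable, Sequence, Tuple

Point = Tuple[float, float]
Predictor = Callable[[Sequence[Point], Sequence[float], Point], float]


def idw_predict(sites: Sequence[Point], values: Sequence[float],
                target: Point, power: float = 2.0) -> float:
    """Inverse-distance-weighted prediction (a simple stand-in for kriging)."""
    num = den = 0.0
    for (x, y), v in zip(sites, values):
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            return v
        w = d ** -power
        num += w * v
        den += w
    return num / den


def loo_cv_mse(sites: Sequence[Point], values: Sequence[float],
               predict: Predictor) -> float:
    """Leave-one-out MSE, using the slide's 1/(n-1) divisor."""
    n = len(sites)
    sq_err = 0.0
    for k in range(n):
        # Drop site k, predict it from the rest, and accumulate the squared error.
        rest_sites = [s for i, s in enumerate(sites) if i != k]
        rest_values = [v for i, v in enumerate(values) if i != k]
        pred = predict(rest_sites, rest_values, sites[k])
        sq_err += (values[k] - pred) ** 2
    return sq_err / (n - 1)
```

Running `loo_cv_mse` once per candidate model and keeping the model with the smallest value mirrors the selection rule described above.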


Cross Validation MSE for Three Variogram Models

[Figure: bar charts of cross-validation MSE for 2000 and 2001, comparing the Exponential, Gaussian, and Spherical models for Q1, Q2, Q3, Q4, and annual.real.]

• Exponential model has the least MSE.
• Conclusion: use Exponential model


Calculating Predicted Annual Averages

Quarter averages:

$P_{Q_i} = SP_{Q_i} + KP_{Q_i}$

Annual average from quarterly analysis:

$P_{annual} = \left( \sum_{i=1}^{4} P_{Q_i} \right) / 4$

Annual average from annual analysis:

$P_{annual} = SP_{annual} + KP_{annual}$


Calculation of Standard Error for Annual Averages

Standard errors of quarterly averages:

$SE_{Q_i} = \sqrt{(SE_{sp_i})^2 + (SE_{kp_i})^2}$

Standard errors of annual averages from quarterly analysis:

$SE_{annual} = \sqrt{\frac{1}{16} \sum_{i=1}^{4} (SE_{Q_i})^2}$

Standard errors of annual averages from annual analysis:

$SE_{annual} = \sqrt{(SE_{sp})^2 + (SE_{kp})^2}$
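The standard error formulas above amount to adding variances and taking a square root. The sketch below assumes, as those formulas implicitly do, that the surface and kriging errors are independent and that the four quarters are independent of one another. Function names and input values are hypothetical.

```python
import math
from typing import Sequence, Tuple


def combined_se(se_sp: float, se_kp: float) -> float:
    """SE of SP + KP when the two error components are treated as independent."""
    return math.hypot(se_sp, se_kp)


def annual_se_from_quarters(quarter_ses: Sequence[Tuple[float, float]]) -> float:
    """SE of the mean of four quarterly predictions, treated as independent.

    `quarter_ses` holds one (SE_sp, SE_kp) pair per quarter.
    """
    if len(quarter_ses) != 4:
        raise ValueError("expected four quarters")
    total_var = sum(combined_se(sp, kp) ** 2 for sp, kp in quarter_ses)
    return math.sqrt(total_var / 16.0)


# Hypothetical standard errors: the same (3, 4) pair in every quarter.
se_annual = annual_se_from_quarters([(3.0, 4.0)] * 4)
```

With these made-up inputs each quarter has a standard error of 5, and averaging four independent quarters halves it to 2.5.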


Sources of Error

• Less than 5% of the total error comes from fitting the quadratic surface.
• Kriging prediction error dominates.

[Figure: bar charts of prediction standard error for 2000 and 2001, broken down into Surface, Kriging, and Total.]


Problems With Quarterly & Annual Analysis

The surface prediction and kriging prediction are not independent.

$\mathrm{Var}(SP + KP) \neq \mathrm{Var}(SP) + \mathrm{Var}(KP)$
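The inequality follows from the general identity for the variance of a sum. The covariance term vanishes only when SP and KP are uncorrelated, so leaving it out misstates the variance whenever the two predictions are correlated:

```latex
\mathrm{Var}(SP + KP) = \mathrm{Var}(SP) + \mathrm{Var}(KP) + 2\,\mathrm{Cov}(SP, KP)
```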

[Figure: scatter plots of kriging prediction against surface prediction. Panels: Annual 2000 SP vs. KP; Annual 2001 SP vs. KP.]


[Figure: scatter plots of kriging prediction against surface prediction. Panels: 2000 Quarter 1 SP vs. KP; 2000 Quarter 2 SP vs. KP; 2000 Quarter 3 SP vs. KP; 2000 Quarter 4 SP vs. KP.]


[Figure: scatter plots of kriging prediction against surface prediction. Panels: 2001 Quarter 1 SP vs. KP; 2001 Quarter 2 SP vs. KP; 2001 Quarter 3 SP vs. KP; 2001 Quarter 4 SP vs. KP.]


More Problems With Quarterly and Annual Analysis

Not using all available data

When kriging residuals, estimated variogram is biased low (Kim and Boos 2002) (This problem could be solved by using generalized least squares.)

Ignored standard deviation of annual and/or quarterly averages in calculation of kriging prediction error

Quarterly averages may not be independent


Methods 3 & 4 - Daily-Based

Used every-third-day data (122 days per year)

Kriged each day to obtain predictions at 2400 lattice points

At each lattice point, fit a time series to the 122 days’ estimates to estimate the annual average

Calculated time series error for annual average using proc arima


Method 3 - “Doug’s Method”

Fit a quadratic surface using the Krig function in Splus

Used an algorithm that minimizes generalized cross validation error in order to estimate all parameters--including both quadratic surface parameters and covariance parameters

Did not assume errors iid when fitting quad surf, so coefficients in quad surf estimated based on cov structure

Specified an exponential covariance structure with a nugget

Provided the fixed value of 200 km for range parameter for all 122 days


Method 4 - “Amy’s Method”

Fit a quadratic surface using Generalized Least Squares in SAS Proc Mixed

Restricted (or residual) Maximum Likelihood used to estimate all parameters

Did not assume errors iid when fitting quad surf, so coefficients in quad surf estimated based on cov structure

Specified an exponential covariance structure with a nugget

Estimated each parameter each day


Problems with Doug’s Method

Using the same value for the range parameter every day requires the assumption that the range parameter is constant over time. This is not a valid assumption. Amy’s method does not make this assumption.

Ignored kriging prediction error in calculation of time series error for annual average.


Problems with Amy’s Method

REML assumes the data for each day are normally distributed. They aren’t. This can be fixed by using a transformation, but we must be careful not to introduce bias in the back-transform. There is an unbiased back-transform predictor and an associated estimate of error in Cressie section 3.2.2. We must also decide whether to transform each day using the same function. Doug’s method does not require the normality assumption.

Ignored kriging prediction error in calculation of time series error for annual average.
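The back-transform bias mentioned above can be illustrated with the log transform. That choice is an assumption, since the slides do not say which transformation would be used. The sketch relies on the lognormal mean identity E[exp(Z)] = exp(mu + sigma^2/2) for Z ~ N(mu, sigma^2). It is not the kriging predictor from Cressie section 3.2.2, and the names and numbers are hypothetical.

```python
import math


def naive_back_transform(mu_log: float) -> float:
    """Exponentiate the log-scale mean; this gives the median, not the mean."""
    return math.exp(mu_log)


def lognormal_mean(mu_log: float, var_log: float) -> float:
    """Mean on the original scale when the log-scale variable is N(mu, var)."""
    return math.exp(mu_log + 0.5 * var_log)


# Hypothetical log-scale mean and variance for one day.
mu, var = math.log(12.0), 0.5
bias_factor = lognormal_mean(mu, var) / naive_back_transform(mu)
```

In this example the naive back-transform understates the mean by a factor of exp(0.25), roughly 28%, which shows why the back-transform has to be chosen with care.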


What if we “propagate” errors?

At a given lattice point we have 122 days’ worth of predictions, each with a kriging prediction error. What if we treat the 122 days as independent observations (they aren’t, they are AR1) and combine the errors accordingly? And we do this for each of our 2400 lattice points.
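Under the independence assumption described above, the combination is the usual rule for the standard error of an average. A minimal sketch for one lattice point, with a hypothetical function name and made-up daily standard errors:

```python
import math
from typing import Sequence


def propagated_se_of_mean(daily_ses: Sequence[float]) -> float:
    """SE of the average of daily predictions, treating the days as independent."""
    n = len(daily_ses)
    if n == 0:
        raise ValueError("need at least one day")
    return math.sqrt(sum(se * se for se in daily_ses)) / n


# 122 days, each with a hypothetical kriging standard error of 2.0.
se_annual = propagated_se_of_mean([2.0] * 122)
```

Because the days are in fact positively correlated (AR1), this calculation will tend to understate the true standard error of the annual average.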


The Big Problem

None of our standard error estimates are correct!

They are all underestimates!

We need to learn how to put spatial error components together with temporal error components.


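Model for one day

$Y_{ij} = \beta_0 + \beta_1 i + \beta_2 i^2 + \beta_3 j + \beta_4 j^2 + \beta_5 ij + \varepsilon_{ij}$

where $i$ = latitude, $j$ = longitude, and $E(\varepsilon_{ij}) = 0$

$\mathrm{Cov}(\varepsilon_{ij}, \varepsilon_{i'j'}) = \sigma_n^2 + \sigma^2 e^{-\mathrm{dist}/\theta}$ if $i = i'$ and $j = j'$

$\mathrm{Cov}(\varepsilon_{ij}, \varepsilon_{i'j'}) = \sigma^2 e^{-\mathrm{dist}/\theta}$ if $i \neq i'$ or $j \neq j'$

(with $\sigma_n^2$ the nugget, $\sigma^2$ the partial sill, and $\theta$ the range parameter)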


Model for one site

$Y_k = \mu + \phi\,(Y_{k-1} - \mu) + e_k$, $k = 1, \dots, 122$

where $E(e_k) = 0$ and $\mathrm{Var}(e_k) = \sigma^2$

Note: this is an AR1 model. The errors are iid $(0, \sigma^2)$ because the temporal correlation is accounted for by the $\phi\,(Y_{k-1} - \mu)$ term.
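For an AR1 model like this one, the variance of the sample mean can be written in closed form, which shows how temporal correlation inflates the error of an annual average. The sketch assumes a stationary series with |phi| < 1; the function name and the parameter values used with it are hypothetical, not estimates from the PM2.5 data.

```python
def ar1_var_of_mean(n: int, phi: float, sigma2: float) -> float:
    """Exact Var(mean of Y_1..Y_n) for a stationary AR(1) with innovation variance sigma2."""
    if not -1.0 < phi < 1.0:
        raise ValueError("stationarity requires |phi| < 1")
    # Marginal variance of each Y_k.
    gamma0 = sigma2 / (1.0 - phi * phi)
    # Sum of correlations phi**|i-j| over all pairs (i, j):
    # n on the diagonal plus 2 * (n - lag) * phi**lag for each lag.
    pair_sum = n + 2.0 * sum((n - lag) * phi ** lag for lag in range(1, n))
    return gamma0 * pair_sum / (n * n)


# Hypothetical comparison for 122 days with the same marginal variance:
correlated = ar1_var_of_mean(122, 0.5, 1.0)
independent = (1.0 / (1.0 - 0.5 ** 2)) / 122
```

With a positive `phi`, `correlated` exceeds `independent`, so treating the 122 days as independent gives too small a variance for the mean.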


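Model for all sites and days?

$Y_{ijk} = \beta_{0,k} + \beta_{1,k} i + \beta_{2,k} i^2 + \beta_{3,k} j + \beta_{4,k} j^2 + \beta_{5,k} ij + \varepsilon_{ijk} + e_{ijk}$

where $E(\varepsilon_{ijk}) = 0$ and $E(e_{ijk}) = 0$

We’ve assumed isotropy and stationarity for simplicity.

But how do we model $\mathrm{Cov}(\varepsilon_{ijk}, \varepsilon_{i'j'k'})$, $\mathrm{Cov}(e_{ijk}, e_{i'j'k'})$, and $\mathrm{Cov}(\varepsilon_{ijk}, e_{i'j'k'})$?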


Separability

We’ve been treating the covariance structure as separable--meaning that the 1-D temporal and 2-D spatial covariance structures can be estimated separately and then can be mathematically combined to obtain a 3-D space-time covariance structure. We need to test for separability, and if the covariance components are separable, we need to appropriately combine them. We are just now learning how to do this.
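In its usual product form, a separable space-time covariance is the spatial covariance multiplied by the temporal correlation. The sketch below assumes an exponential spatial covariance and an AR1 temporal correlation, borrowed from the one-day and one-site models. Whether this product form suits the PM2.5 data is exactly what would need to be tested; the function names and parameter values are hypothetical.

```python
import math


def spatial_cov(dist: float, sill: float, range_: float, nugget: float = 0.0) -> float:
    """Exponential spatial covariance; the nugget is added only at distance 0."""
    c = sill * math.exp(-dist / range_)
    return c + nugget if dist == 0.0 else c


def temporal_corr(lag: int, phi: float) -> float:
    """AR(1) autocorrelation at an integer lag."""
    return phi ** abs(lag)


def separable_cov(dist: float, lag: int, sill: float, range_: float,
                  phi: float, nugget: float = 0.0) -> float:
    """Space-time covariance as the product of spatial and temporal parts."""
    return spatial_cov(dist, sill, range_, nugget) * temporal_corr(lag, phi)


# Hypothetical parameters: sill 4, range 200 km, phi 0.5; one range apart, one lag apart.
c = separable_cov(200.0, 1, 4.0, 200.0, 0.5)
```

A defining property of this form is that C(h, u) · C(0, 0) = C(h, 0) · C(0, u), which is one way to think about what a test for separability has to check.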


Next Steps…

Re-do Quarterly and Annual analyses using generalized least squares

Perform Amy’s analysis using transformations, making sure to use an unbiased estimator in the back-transform and the appropriate error estimator. How much does the lack of normality in the original analysis affect results?


More next steps…

Investigate the separability of the covariance structure and the correct method for combining space and time covariance components.

Attempt a 3-dimensional kriging. No assumption of separability is required to do this. We must, however, write our own code for this project because there is no software package (to our knowledge) that performs such an analysis. This method would allow us to use even more data than we are using now, as we would not be restricted to every third day.


That’s all, folks!
