Olga Ivina PhD thesis presentation (short)
Conformal prediction of air pollution concentrations for the Barcelona Metropolitan Region
PhD Thesis summary
Olga Ivina
University of Girona
GRECS research group
CIBER de Epidemiología y la Salud Pública
November 22, 2012
Outline
Introduction
- Air pollution and its effects
- Air pollution exposure assessment
- Conformal predictors for air pollution problem
Objectives
Methods and data
- Kriging
- Conformal predictors
- Computing
- Data
Results
- Ordinary kriging and RRCM models in default setting
- Kernelisation: a Gaussian kernel
- Kernelisation: other kernels
- Comparison of models
Discussion
Conclusion
- Conformal predictors and geostatistics
- Future research
Introduction
Air pollution and its effects
Air pollution is a problem of growing concern all over the world. There is a large body of scientific evidence of the hazardous effects of air pollution on people's health and well-being, as well as on the general ecological condition of our planet.
In people: association with adverse health outcomes, both in adults and in children. Children are especially susceptible to pollution. They are affected from the very first stages of their lives onwards. Linked outcomes (to name a few):
- preterm birth and low birth weight
- asthma aggravation, cough and bronchitis
- allergies: hay fever, rhinitis, ...
- excess risk of mortality
Air pollution and its effects - 2
Adults are affected by pollution as well. In adults, pollution is linked to both long-term and short-term health effects (to name a few):
- respiratory: COPD, asthma, chronic bronchitis
- lung cancer
- cardiovascular morbidity
- mortality: cancer, all-cause, cardiopulmonary, non-accidental,...
Special factors of impact: socioeconomic status (SES) and geographical location of a person.
Air pollution and its effects - 3
Global air pollution map produced by Envisat’s SCIAMACHY.
Authors: S. Beirle, U. Platt and T. Wagner, University of Heidelberg’s Institute for Environmental Physics.
Air pollution and its effects - 4
The main contributor to air pollution in urban areas is traffic. Two "criteria" traffic-related air pollutants are considered in this study:
- nitrogen dioxide (NO2)
- particulate matter PM10
NO2 effects:
- short-term: respiratory effects and asthma aggravation
- long-term: risk of coronary heart disease and fatal events
PM10 effects:
- short-term: aggravation of respiratory and cardiovascular diseases, premature death, ...
- long-term: development of heart and lung diseases, premature death, ...
Air pollution exposure assessment
Problem: direct measurements of pollution are not always available.
A large number of models aim to predict pollution at a given spot. The main classes are:
- proximity models
- geostatistical models
- land use regression (LUR) models
- dispersion models
- integrated meteorological emission (IME) models
- hybrid models
Conformal predictors for air pollution problem
Problem: existing methods for air pollution exposure assessment may lack confidence in their predictions.
To tackle this problem, this research suggests using a recently developed approach, conformal predictors. A conformal predictor is a "confidence predictor", where the level of confidence for the prediction is chosen ad hoc by the user. Its predictions are always valid; this is guaranteed by the definition of a conformal predictor.
Conformal predictors for air pollution problem - 2
A conformal predictor is defined by some nonconformity measure, and it has two major desiderata:
- validity of predictions
- efficiency of predictions
Conformal predictors are flexible: they can be based upon almost any underlying statistical algorithm.
In air pollution modeling, if a regression-based algorithm such as LUR or kriging is used, the regression residuals serve as a nonconformity measure.
Objectives
This dissertation has two major objectives:
1. To demonstrate the capacity of conformal predictors as a method for spatial environmental modeling.
2. To provide valid estimates of nitrogen dioxide and fine particulate matter for the Barcelona Metropolitan Region.
Methods and data
Kriging
Kriging is a spatial interpolation method. It provides a prediction of a quantity of interest at an unobserved point on the basis of a set of observed points. It also provides an estimate of the error variance (called the "kriging variance").
It was first introduced in 1951 by the South African engineer D. G. Krige in his master's thesis on the estimation of a mineral ore body. The method has been developed further: nowadays the term "kriging" stands for a set of methods such as ordinary kriging, simple kriging, co-kriging, Bayesian kriging, etc.
In its simplest form, a kriging estimate of the data at an unobserved location is a linear combination of the observed data. The coefficients of the equation depend on the spatial structure of the data and on the spatial covariance.
Kriging - 2
The most common form of kriging is ordinary kriging. It is used when the mean of the second-order stationary process is unknown. It is based on the geostatistical concept of the variogram and on the covariance function used to approximate it. Let there be $n$ neighboring observed locations, $x_1, \dots, x_n$, and an unobserved location $x_0$, on a spatial domain $D$. Let $\{Z(x) : x \in D\}$ denote the process, and let it have a variogram $\gamma(h)$. Then the ordinary kriging estimate $Z^*_{OK}(x_0)$ at the unobserved point $x_0$ takes the following analytical form:
$$Z^*_{OK}(x_0) = \sum_{\alpha=1}^{n} \omega_\alpha Z(x_\alpha), \tag{1}$$
where $\omega_\alpha$ are the kriging weights. Ordinary kriging provides best linear unbiased (BLUE) estimates of a random field, together with an error variance estimate (the kriging variance).
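As a rough illustration of equation (1), the sketch below solves the ordinary kriging system for the weights on a toy configuration. It is not the geoR implementation used in the thesis: the exponential covariance model, its parameters (`sill`, `corr_range`), the station coordinates and the concentrations are all assumptions made up for the example.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def cov(p, q, sill=1.0, corr_range=10.0):
    """Assumed exponential covariance model (illustrative parameters only)."""
    return sill * math.exp(-math.dist(p, q) / corr_range)

def ordinary_kriging(points, values, x0):
    """Return (weights, estimate, kriging variance) at location x0."""
    n = len(points)
    # Covariance matrix bordered by ones: the extra row forces the weights to sum to 1.
    a = [[cov(points[i], points[j]) for j in range(n)] + [1.0] for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [cov(p, x0) for p in points] + [1.0]
    sol = solve(a, b)
    weights, mu = sol[:n], sol[n]  # mu is the Lagrange multiplier
    estimate = sum(w * z for w, z in zip(weights, values))
    variance = cov(x0, x0) - sum(w * c for w, c in zip(weights, b[:n])) - mu
    return weights, estimate, variance

# Hypothetical stations on the corners of a square, with made-up concentrations.
stations = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
no2 = [40.0, 50.0, 45.0, 60.0]
weights, estimate, variance = ordinary_kriging(stations, no2, (5.0, 5.0))
```

At the centre of the square the four weights are equal by symmetry, so the estimate is the plain average of the observations; at an observed station the estimate reproduces the observation and the kriging variance drops to zero.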
New methods. Conformal predictors
How does it work? We are given pairs of observations $(x_i, y_i)$, where $x_i$ is an object and $y_i$ is a label. Then
$$\mathbf{Z} := \mathbf{X} \times \mathbf{Y} \tag{2}$$
denotes the example space; $\mathbf{Z}$ is a measurable space. Given an incomplete data sequence $(x_1, y_1), (x_2, y_2), \dots, (x_{n-1}, y_{n-1}) \in \mathbf{Z}^*$, the aim is to predict the label $y_n$ of a new object $x_n$. An operator
$$D : \mathbf{Z}^* \times \mathbf{X} \to \mathbf{Y} \tag{3}$$
is then called a simple predictor (e.g., an ordinary kriging predictor).
New methods. Conformal predictors - 2
The prediction can be described as:
$$\hat{y}_n = D(x_1, y_1, x_2, y_2, \dots, x_{n-1}, y_{n-1}; x_n), \quad \hat{y}_n \in \mathbf{Y}. \tag{4}$$
Let us instead allow the predictor to output prediction sets $Y_n \subseteq \mathbf{Y}$ large enough to provide confidence in the prediction. This means that the true value of $y_n$ will fall in $Y_n$ with a given level of confidence, which is chosen in advance and supplied to the predictor.
A conformal predictor is a confidence predictor defined by some nonconformity measure. Given the measure, a conformal predictor outputs the prediction set under the assumption that the new example conforms with the observed ones.
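The idea of a prediction set built from a nonconformity measure can be sketched on a toy problem without objects, only labels. The nonconformity measure used here (distance of a label from the mean of the augmented bag) and the data are illustrative assumptions, not the measure used in the thesis.

```python
def conformal_set(labels, candidates, epsilon):
    """Keep every candidate label whose conformal p-value exceeds epsilon."""
    kept = []
    for y in candidates:
        bag = labels + [y]                       # tentatively add the candidate
        mean = sum(bag) / len(bag)
        scores = [abs(v - mean) for v in bag]    # assumed nonconformity measure
        # p-value: share of examples at least as nonconforming as the candidate
        p_value = sum(s >= scores[-1] for s in scores) / len(bag)
        if p_value > epsilon:
            kept.append(y)
    return kept

observed = [40, 42, 45, 47, 50]                  # made-up concentrations
prediction_set = conformal_set(observed, list(range(0, 101)), epsilon=0.2)
```

Candidates that look like the observed labels receive large p-values and stay in the set; a smaller `epsilon` (higher confidence) can only enlarge the set.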
New methods. Conformal predictors - 3
The ridge regression confidence machine (RRCM) is a regression-based conformal predictor. It uses the ridge regression procedure (Hoerl and Kennard, 1970) as its underlying algorithm.
Suppose $X_n$ is the $n \times p$ matrix of objects (independent variables), and $Y_n$ is the vector of labels (dependent variables). Then the ridge regression estimate of the parameters $\omega$ used by RRCM takes the form:
$$\hat{\omega} = (X_n' X_n + a I_p)^{-1} X_n' Y_n, \tag{5}$$
where $a$ is the ridge factor; $a = 0$ yields the standard least squares estimate. The nonconformity scores for this predictor are the absolute regression residuals: $|e_i| := |y_i - \hat{y}_i|$.
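Equation (5) can be checked numerically on a tiny design matrix. The sketch below is a plain-Python illustration with made-up data; the function name `ridge` is hypothetical and is not taken from the PredictiveRegression package.

```python
def ridge(X, y, a):
    """Solve (X'X + a*I) w = X'y by Gaussian elimination (small p only)."""
    p = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) + (a if i == j else 0.0) for j in range(p)]
         for i in range(p)]
    b = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    for k in range(p):                      # forward elimination
        for i in range(k + 1, p):
            f = A[i][k] / A[k][k]
            for j in range(k, p):
                A[i][j] -= f * A[k][j]
            b[i] -= f * b[k]
    w = [0.0] * p
    for i in range(p - 1, -1, -1):          # back substitution
        w[i] = (b[i] - sum(A[i][j] * w[j] for j in range(i + 1, p))) / A[i][i]
    return w

# Made-up data lying exactly on y = 1 + 2x; first column is the intercept.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]
w_ols = ridge(X, y, 0.0)      # a = 0: ordinary least squares
w_ridge = ridge(X, y, 10.0)   # a > 0: coefficients shrink toward zero
residuals = [abs(yi - sum(wj * xj for wj, xj in zip(w_ridge, row)))
             for row, yi in zip(X, y)]  # nonconformity scores |e_i|
```

With `a = 0` the exact line is recovered; increasing `a` shrinks the coefficient vector and the absolute residuals become nonzero.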
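New methods. Conformal predictors - 4
Given a significance level for prediction (roughly, the probability of error that must not be exceeded), an RRCM predictor builds its set of candidate labels $y$ for $y_n$ from the sets
$$S_i := \{y : \alpha_i(y) \ge \alpha_n(y)\} = \{y : |a_i + b_i y| \ge |a_n + b_n y|\}, \tag{6}$$
where $a_i$ and $b_i$ are the components of the vectors $A$ and $B$, which express the vector of residuals as the linear function $A + By$ of the candidate label $y$. A candidate $y$ is included in the prediction set when the fraction of indices $i$ with $y \in S_i$ exceeds the significance level.
RRCM outputs prediction sets instead of the point predictions that kriging produces. These sets can take the form of a point, an interval, a ray, a union of two rays, the whole real line, or the empty set. Usually, the set is an interval.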
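To make equation (6) concrete, the following sketch evaluates the p-value of a candidate label from given vectors A and B and collects the labels that pass the significance level. The vectors here are hand-picked toy values, not derived from a fitted ridge regression, and the candidate labels are scanned on a grid rather than through the exact interval arithmetic a real RRCM would use.

```python
def rrcm_p_value(A, B, y):
    """Share of indices i with |a_i + b_i*y| >= |a_n + b_n*y| (new example is last)."""
    new_score = abs(A[-1] + B[-1] * y)
    return sum(abs(a + b * y) >= new_score for a, b in zip(A, B)) / len(A)

def rrcm_set(A, B, candidates, epsilon):
    """Candidate labels whose p-value exceeds the significance level epsilon."""
    return [y for y in candidates if rrcm_p_value(A, B, y) > epsilon]

# Toy vectors (assumed): three old examples with fixed residuals 1, 2 and 0.5,
# and a new example whose residual is y - 10.
A = [1.0, -2.0, 0.5, -10.0]
B = [0.0, 0.0, 0.0, 1.0]
interval = rrcm_set(A, B, list(range(0, 21)), epsilon=0.25)
```

For these toy values the result is the interval of labels within 2 units of 10, because a candidate survives only if at least one old residual is as large as its own.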
New methods. Conformal predictors - 5
When the number of parameters $p$ is large, computation is hard. The "kernel trick" is a method that helps deal with high-dimensional data. It also makes it possible to account for nonlinearity in RRCM.
A kernel is a similarity measure that operates in a feature space. Given an input space $X$ and an operator $\Phi$ that maps $X$ to a feature space $H$ equipped with a dot product:
$$\Phi : X \to H, \qquad x \mapsto \mathbf{x} := \Phi(x),$$
a kernel is defined as follows. For $x_\alpha, x_\beta \in X$:
$$k(x_\alpha, x_\beta) = \langle \Phi(x_\alpha), \Phi(x_\beta) \rangle \tag{7}$$
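Definition (7) can be verified by hand for a kernel whose feature map is known explicitly. The sketch below uses the second-degree inhomogeneous polynomial kernel $(1 + \langle x, z \rangle)^2$ on two-dimensional inputs; the offset 1 and the sample points are assumptions chosen for the example.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def phi(x):
    """Explicit feature map of the kernel (1 + <x, z>)^2 for 2-D inputs."""
    x1, x2 = x
    r2 = math.sqrt(2.0)
    return [1.0, r2 * x1, r2 * x2, x1 * x1, x2 * x2, r2 * x1 * x2]

def poly_kernel(x, z):
    """Same quantity computed in the input space, without building phi."""
    return (1.0 + dot(x, z)) ** 2

x, z = (1.0, 2.0), (3.0, -1.0)
kernel_value = poly_kernel(x, z)          # computed in 2 dimensions
feature_value = dot(phi(x), phi(z))       # computed in 6 dimensions
```

Both computations give the same number, which is the point of the kernel trick: the dot product in the feature space is obtained without ever forming $\Phi(x)$.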
New methods. Conformal predictors - 6
Any conventional covariance function for kriging can be used as a kernel for RRCM. This research uses three (positive definite) kernels:
- a dot product kernel (default)
- a radial basis Gaussian kernel
- an inhomogeneous polynomial kernel of the second degree
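For reference, the three kernels are commonly written as below. The parametrisation (the bandwidth $\sigma$ and the offset $c$) is an assumption; the slides do not state which form or parameter values were used.

```latex
\begin{align*}
k_{\mathrm{dot}}(x, z)  &= \langle x, z \rangle, \\
k_{\mathrm{RBF}}(x, z)  &= \exp\!\left(-\frac{\lVert x - z \rVert^2}{2\sigma^2}\right), \quad \sigma > 0, \\
k_{\mathrm{poly}}(x, z) &= \left(\langle x, z \rangle + c\right)^2, \quad c > 0.
\end{align*}
```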
Computing
All computational work was done in R.
- Kriging: geoR package, function krige.conv
- RRCM: PredictiveRegression package, function iidpred
- "Kernel trick": self-developed functions (based on the PredictiveRegression package) for RRCM in "dual form" and for implementing the kernels.
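The "dual form" mentioned above can be illustrated for plain ridge regression: the prediction is written through the kernel matrix instead of the parameter vector. The sketch below is in Python rather than R and is not the author's code; all function names and data are hypothetical, and only the point prediction (not the RRCM prediction set) is shown.

```python
def solve_spd(A, b):
    """Gaussian elimination without pivoting; adequate for K + a*I with a > 0."""
    n = len(b)
    A = [row[:] for row in A]
    b = b[:]
    for k in range(n):
        for i in range(k + 1, n):
            f = A[i][k] / A[k][k]
            for j in range(k, n):
                A[i][j] -= f * A[k][j]
            b[i] -= f * b[k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - sum(A[i][j] * x[j] for j in range(i + 1, n))) / A[i][i]
    return x

def linear_kernel(u, v):
    return sum(a * b for a, b in zip(u, v))

def dual_ridge_predict(X, y, x_new, a, kernel):
    """Dual-form ridge prediction: k(x_new)' (K + a*I)^{-1} y."""
    n = len(X)
    K = [[kernel(X[i], X[j]) + (a if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    alpha = solve_spd(K, y)
    return sum(al * kernel(xi, x_new) for al, xi in zip(alpha, X))

X = [[1.0], [2.0], [3.0]]     # made-up one-dimensional objects
y = [2.0, 4.0, 6.5]           # made-up labels
a = 1.0                       # ridge factor
dual = dual_ridge_predict(X, y, [4.0], a, linear_kernel)

# Primal form of equation (5) for p = 1, for comparison.
primal_w = sum(x[0] * t for x, t in zip(X, y)) / (sum(x[0] ** 2 for x in X) + a)
primal = primal_w * 4.0
```

With the dot product kernel the dual and primal predictions coincide; swapping `linear_kernel` for another positive definite kernel changes the model without changing the rest of the code.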
Data
The data for this study were kindly provided by the XVPCA (Network for Monitoring and Forecasting of Air Pollution) of the Generalitat de Catalunya.
Mean annual concentrations of two criteria pollutants, NO2 and PM10, are provided for the Barcelona Metropolitan Region, together with the geographical coordinates of the monitoring stations (Mercator, UTM 31).
Time frames:
- NO2: 1998 to 2009, excluding 2003
- PM10: 2001 to 2009, excluding 2003
Data - 2
There are 49 monitoring stations over the area in total.
The Barcelona Metropolitan Region (BMR) covers about 3200 km² and has over 5 million inhabitants.
About 107 million trips are made in the BMR every week, 54.1% of them by motorized transport.
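Data - 3
Table 1. Data on mean annual nitrogen dioxide concentrations: available observations for each year

| Year | 1998 | 1999 | 2000 | 2001 | 2002 | 2004 | 2005 | 2006 | 2007 | 2008 | 2009 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Observations | 24 | 25 | 25 | 25 | 25 | 24 | 22 | 24 | 25 | 25 | 24 |

Table 2. Data on mean annual particulate matter concentrations: available observations for each year

| Year | 2001 | 2002 | 2004 | 2005 | 2006 | 2007 | 2008 | 2009 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Observations | 22 | 24 | 28 | 28 | 29 | 30 | 33 | 36 |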
Data - 4
Two major drawbacks, or limiting factors, of the data set:
- Size: the number of observations for each year and pollutant is small.
- Distribution: the measurement sites are far apart from one another and unevenly distributed over the geographic region.
Also, the data are annual means; more frequent observations were unavailable for this study.
Results
Ordinary kriging and RRCM modeling results [three figure-only slides]
Kernelisation: a Gaussian kernel [three figure-only slides]
Comparison of the RRCM models [two figure-only slides]
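Comparison of the RRCM models - 3
Table: Comparison of models for different ridge factors (µg/m3)

| Year | linear iid, ridge 0.01 | linear iid, ridge 1 | linear iid, ridge 2 | RBF, ridge 0.01 | RBF, ridge 1 | RBF, ridge 2 | polynomial, ridge 0.01 | polynomial, ridge 1 | polynomial, ridge 2 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 2001 | 64.46 | 64.44 | 67.13 | 71.08 | 63.11 | 66.06 | 71.95 | 74.63 | 77.24 |
| 2002 | 43.43 | 42.46 | 45.54 | 47.41 | 42.91 | 45.05 | 50.44 | 53.17 | 55.82 |
| 2004 | 47.26 | 39.17 | 34.59 | 51.48 | 39.29 | 35.19 | 34.66 | 37.00 | 39.51 |
| 2005 | 39.65 | 45.14 | 49.28 | 35.50 | 47.60 | 51.91 | 51.44 | 54.76 | 57.76 |
| 2006 | 47.68 | 45.40 | 48.63 | 55.51 | 46.09 | 48.86 | 52.48 | 55.27 | 57.86 |
| 2007 | 91.43 | 94.02 | 96.45 | 85.40 | 94.09 | 96.65 | 99.83 | 102.11 | 104.29 |
| 2008 | 49.48 | 50.90 | 52.58 | 45.42 | 55.27 | 58.21 | 55.60 | 57.26 | 58.91 |
| 2009 | 28.42 | 27.32 | 29.01 | 29.16 | 26.11 | 27.79 | 32.26 | 33.67 | 35.09 |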
Comparison of the RRCM models - 4 and 5 [two figure-only slides]
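Comparison of the RRCM models - 6
Table: Comparison of models for different ridge factors (µg/m3)

| Year | linear iid, ridge 0.01 | linear iid, ridge 1 | linear iid, ridge 2 | RBF, ridge 0.01 | RBF, ridge 1 | RBF, ridge 2 | polynomial, ridge 0.01 | polynomial, ridge 1 | polynomial, ridge 2 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1998 | 76.08 | 72.33 | 68.27 | 65.81 | 72.37 | 68.37 | 65.27 | 64.71 | 65.99 |
| 1999 | 66.31 | 60.11 | 61.44 | 67.68 | 60.57 | 60.39 | 65.32 | 68.20 | 70.87 |
| 2000 | 51.69 | 55.27 | 57.89 | 50.91 | 52.90 | 55.63 | 61.89 | 64.19 | 66.38 |
| 2001 | 36.25 | 41.30 | 44.90 | 35.32 | 38.65 | 42.36 | 49.54 | 52.34 | 54.95 |
| 2002 | 52.12 | 46.57 | 49.51 | 47.78 | 51.44 | 57.38 | 54.51 | 56.99 | 59.37 |
| 2004 | 53.65 | 59.11 | 62.46 | 53.89 | 56.95 | 60.41 | 67.06 | 69.36 | 71.60 |
| 2005 | 78.75 | 84.77 | 88.57 | 79.44 | 82.18 | 86.14 | 94.41 | 96.94 | 99.43 |
| 2006 | 61.79 | 66.39 | 69.78 | 61.24 | 63.82 | 67.38 | 74.90 | 77.36 | 79.76 |
| 2007 | 47.01 | 49.35 | 53.13 | 48.15 | 47.11 | 51.04 | 57.15 | 59.91 | 62.48 |
| 2008 | 46.96 | 50.15 | 53.58 | 47.45 | 48.04 | 51.55 | 57.63 | 60.21 | 62.63 |
| 2009 | 55.59 | 55.17 | 53.89 | 48.38 | 54.35 | 52.68 | 52.79 | 55.19 | 57.57 |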
Discussion
Efficiency of predictions
Kriging predictions are smooth and vary little; they are also made for mean annual data. The error estimates, however, are very large for nitrogen dioxide and small for airborne particles. This reflects the properties of the substances: NO2 is known to have generally larger variability than PM10.
Kriging intervals can be derived by assuming that the data are Gaussian. This assumption is common, but not always correct. RRCM makes no assumption about the data distribution other than that the data are iid.
Two factors help boost the efficiency of RRCM predictions: the kernel and the ridge factor. The latter is chosen by brute force (the method of consecutive approximations).
Conclusion
Conformal predictors and geostatistics
Table: Comparison of OK and RRCM

| OK | RRCM |
| --- | --- |
| point predictions | prediction sets (usually intervals) |
| regression algorithm | regression algorithm |
| Gaussianity assumption | iid assumption |
| estimates error variance | - |
| uses variogram and covariance function to approximate it | uses any appropriate kernel |
| - | ridge factor |
| may lack confidence | confidence level is chosen and guaranteed |
Future research
- Extend the existing data set for BMR
- Provide additional validation for the methods
- Test these models on the data for other cities
- Develop conformal predictors on the basis of other popular air pollution exposure modeling algorithms (land use regression, dispersion models, etc.)
Selected references
V. Vovk, A. Gammerman, G. Shafer, Algorithmic learning in a random world, Springer (2005).
V. Vovk, I. Nouretdinov, A. Gammerman, On-line predictive linear regression, The Annals of Statistics (2009).
H. Wackernagel, Multivariate geostatistics: an introduction with applications, Springer (2003).
B. Schölkopf, A. J. Smola, Learning with kernels: support vector machines, regularization, optimization, and beyond, MIT Press (2002).
A. Lertxundi-Manterola, M. Saez, Modelling of nitrogen dioxide (NO2) and fine particulate matter (PM10) air pollution in the metropolitan areas of Barcelona and Bilbao, Spain, Environmetrics (2009).
Selected references - 2
A. Hoerl, R. Kennard, Ridge regression: Biased estimation for nonorthogonal problems, Technometrics 12.1 (1970).
P. Diggle, P. Ribeiro Jr., Model-Based Geostatistics, Springer (2007).
P. Ribeiro Jr., P. Diggle, geoR: a package for geostatistical analysis, R-NEWS 1.2 (2001).
N. Cressie, Statistics for spatial data, Wiley (1993).
M. Jerrett et al., A review and evaluation of intraurban air pollution exposure models, Journal of Exposure Analysis and Environmental Epidemiology (2005).