K-Nearest Neighbor Resampling Technique (Weather Generation and Water Quality Applications) Balaji Rajagopalan Somkiat Apipattanavis & Erin Towler Department of Civil, Environmental and Architectural Engineering University of Colorado Boulder, CO Denver Water February 2007

Page 1

K-Nearest Neighbor Resampling Technique

(Weather Generation and Water Quality Applications)

Balaji Rajagopalan

Somkiat Apipattanavis & Erin Towler

Department of Civil, Environmental and Architectural Engineering

University of Colorado

Boulder, CO

Denver Water

February 2007

Page 2

“Translation” of Climate Info

• Users most interested in sectoral outcomes (streamflows, crop yields, risk of disease X)

Climate Forecast / Projection → Forecast / Projection Translation → Process Models → Distribution of Outcomes

Page 3

Translation

[Figure: Historical Data → Synthetic series → Process model → Frequency distribution of outcomes]

Page 4

Why Simulation?

• Limited historical data

– Cannot capture the full range of variability

– Selecting a (single or a set of) historical years from the record with equal chance: unconditional bootstrap, Index Sequential Method

• Need: a tool to generate ‘scenarios’ that capture the historical statistical properties

• Several statistical techniques are available (e.g., time series techniques, Monte Carlo techniques, etc.)

– These are cumbersome and restrictive (in their assumptions)

• Re-sampling techniques are simple and robust

– Unconditional and conditional bootstrap and the K-nearest neighbor (K-NN) bootstrap offer attractive alternatives.

Page 6

Re-sampling Techniques

• Drawing cards from a well-shuffled deck

– Selecting a (single or a set of) historical years from the record with equal chance

– Unconditional bootstrap, Index Sequential Method

• Drawing cards from a biased deck

– Selecting a (single or a set of) historical years with unequal chance, e.g., selecting only El Niño years

– Conditional bootstrap

• K-Nearest Neighbor Bootstrap: “pattern matching”

– Select the ‘K’ nearest neighbors (e.g., years) to the current ‘feature’

– Select one of the K neighbors at random

– Repeat to produce an ensemble
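The three ways of drawing from the deck can be sketched in a few lines of Python. This is only an illustration: the record `index_by_year`, its values, the "warm year" threshold and the function names are all made up here, and the feature is assumed to be a single number compared by absolute difference.

```python
import random

def unconditional_bootstrap(years, n, rng):
    """Well-shuffled deck: every historical year has equal chance."""
    return [rng.choice(years) for _ in range(n)]

def conditional_bootstrap(years, is_eligible, n, rng):
    """Biased deck: only years meeting a condition can be drawn."""
    pool = [y for y in years if is_eligible(y)]
    return [rng.choice(pool) for _ in range(n)]

def knn_bootstrap(feature_by_year, current_feature, k, n, rng):
    """Pattern matching: draw at random among the k years closest to the current feature."""
    ranked = sorted(feature_by_year,
                    key=lambda y: abs(feature_by_year[y] - current_feature))
    neighbors = ranked[:k]
    return [rng.choice(neighbors) for _ in range(n)]

rng = random.Random(0)
# Hypothetical record: year -> scalar climate index (made-up values).
index_by_year = {1990: -0.8, 1991: 0.4, 1992: 1.3, 1993: 0.2,
                 1994: 0.9, 1995: -0.5, 1996: -0.3, 1997: 2.1}
years = list(index_by_year)

equal_chance = unconditional_bootstrap(years, 5, rng)
warm_only = conditional_bootstrap(years, lambda y: index_by_year[y] > 0.5, 5, rng)
pattern_matched = knn_bootstrap(index_by_year, current_feature=1.0, k=3, n=5, rng=rng)
```

Repeating any of these draws many times produces the ensemble.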

Page 7

Examples

• Ensemble Weather Generation

– Scenario generation

– Forecast

Argentina - Pampas Region

• Water Quality Modeling

(Boulder Water Utility)

Page 8

Two Step Weather Generator

Probability of Dry and Wet Days

Dry day      Wet day
0.60 (pd)    0.40 (pw)

Transition Prob (pij)

           Dry day       Wet day
Dry day    0.70 (pdd)    0.30 (pdw)
Wet day    0.80 (pwd)    0.20 (pww)

Generated Precipitation State time series: 1 0 0 1 1 0 0 0 1 0 0 ...
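A minimal sketch of the first step, the order-1 Markov chain for the precipitation state, using the transition probabilities shown above. The function names are illustrative. For brevity it uses one set of probabilities rather than one per month, and it re-estimates them from the generated series by counting transitions, which is an assumption about how the estimation from historical data would be done.

```python
import random

# States: 0 = dry day, 1 = wet day.
# Probability that tomorrow is wet given today's state (pdw and pww above).
P_WET_GIVEN = {0: 0.30, 1: 0.20}

def generate_states(n_days, first_state, rng):
    """Generate a dry/wet sequence from the order-1 Markov chain."""
    states = [first_state]
    for _ in range(n_days - 1):
        p_wet = P_WET_GIVEN[states[-1]]
        states.append(1 if rng.random() < p_wet else 0)
    return states

def estimate_transitions(states):
    """Estimate P(wet tomorrow | state today) by counting transitions."""
    counts = {0: [0, 0], 1: [0, 0]}
    for today, tomorrow in zip(states, states[1:]):
        counts[today][tomorrow] += 1
    return {s: counts[s][1] / sum(counts[s]) for s in counts if sum(counts[s]) > 0}

rng = random.Random(42)
series = generate_states(20000, first_state=1, rng=rng)
estimated = estimate_transitions(series)  # should be close to P_WET_GIVEN
```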

• Estimate Transition (wet to dry, etc.) Probabilities of the Markov Chain order-1 from historical data – for each month

• Generate Precipitation State time series using Markov Chain

• Suppose we need weather simulation for January 5th - January 4th is a wet day

• Get Neighbors from a 7-day window (7*50) centered on January 4th

• Screen days using the Precipitation state [(1,0), days in blue] – i.e., “Potential Neighbors”

• Calculate the distances between weather variables of current day feature vector and the potential neighbors

• Select the K-nearest neighbors

• Assign them weights

[Table: daily precipitation values by year (rows) and calendar day (columns: January 1, 2, 3, ..., then February); the window of days around January 4th supplies the potential neighbors]

• Pick a day from k-NN using the weight function – say, Jan 1st 1953

• The simulated weather for Jan 5th is Jan 2nd 1953.

• Repeat

$$K[j(i)] = \frac{1/j}{\sum_{j=1}^{k} 1/j}, \qquad k = \sqrt{n}$$
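The second step can be sketched as follows, assuming the candidates have already been screened by precipitation state. The distance is taken to be Euclidean and unscaled, the weight of the j-th nearest neighbor is (1/j) divided by the sum of 1/i for i = 1..k, and k is the rounded square root of the number of candidates. The candidate values and labels are made up.

```python
import math
import random

def knn_weights(k):
    """Rank-based weights: nearest neighbor gets the largest weight; weights sum to 1."""
    total = sum(1.0 / i for i in range(1, k + 1))
    return [(1.0 / j) / total for j in range(1, k + 1)]

def pick_successor(current, candidates, rng):
    """candidates: list of (feature_vector_of_day, weather_of_following_day)."""
    k = max(1, round(math.sqrt(len(candidates))))
    ranked = sorted(candidates, key=lambda c: math.dist(current, c[0]))
    nearest = ranked[:k]
    chosen = rng.choices(nearest, weights=knn_weights(k), k=1)[0]
    return chosen[1]  # the simulated weather is the day after the picked day

rng = random.Random(7)
current_day = (30.0, 18.0)  # hypothetical (max temp, min temp) for January 4th
# Hypothetical potential neighbors: (features, label of the following day's weather).
potential_neighbors = [
    ((29.5, 18.2), "day after A"), ((31.0, 17.0), "day after B"),
    ((25.0, 12.0), "day after C"), ((30.2, 18.1), "day after D"),
    ((35.0, 22.0), "day after E"), ((28.0, 16.0), "day after F"),
    ((33.0, 20.0), "day after G"), ((22.0, 10.0), "day after H"),
    ((30.9, 18.8), "day after I"),
]
simulated = pick_successor(current_day, potential_neighbors, rng)
```

Feeding the simulated day back in as the new current day and repeating yields a full synthetic series.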

Page 9

Single Site Simulation

• Pergamino, Argentina

– Daily weather variables, 1931-2003: precipitation, max. temperature, min. temperature

• 100 simulations, each 73 years long (the length of the record)

• Statistics of simulated and historical data are compared

Page 10

Spell Properties

Pergamino, Argentina

Page 11

wet and dry spell statistics

Page 12

Moments (wet month - Jan)

Page 13

Moments (dry month - July)

Page 14

Conditional K-NN Re-sampling

• Conditioned on IRI seasonal forecast

• Get the prediction (A:N:B = 40:35:25)

• Divide the historical (seasonal) totals into 3 tercile categories

• Bootstrap 40, 35 and 25 samples of historical years from the wet, normal and dry categories, respectively

• Apply the two-step weather generator on this sample.
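The conditioning step can be sketched as below. The seasonal totals are made-up numbers, the function names are illustrative, and the categories are assumed to be equal thirds of the ranked record.

```python
import random

def tercile_categories(seasonal_totals):
    """Split years into below (B), near (N) and above (A) normal thirds by seasonal total."""
    ranked = sorted(seasonal_totals, key=seasonal_totals.get)
    third = len(ranked) // 3
    return {"B": ranked[:third],
            "N": ranked[third:len(ranked) - third],
            "A": ranked[len(ranked) - third:]}

def conditional_sample(categories, forecast, rng):
    """Draw, with replacement, the forecast number of years from each category."""
    sample = []
    for label, count in forecast.items():
        sample.extend(rng.choice(categories[label]) for _ in range(count))
    return sample

rng = random.Random(3)
# Hypothetical seasonal precipitation totals (mm) for 30 years; values are distinct.
seasonal_totals = {1971 + i: 300 + (i * 53) % 400 for i in range(30)}
categories = tercile_categories(seasonal_totals)
forecast = {"A": 40, "N": 35, "B": 25}  # A:N:B = 40:35:25
conditioned_years = conditional_sample(categories, forecast, rng)
```

The two-step weather generator would then be run on the 100 years in `conditioned_years` instead of the full record.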

Page 15

Conditional Weather Generation (results)

Page 16

Multi-site extension

• Same procedure as single site is used, but

– Calculate the average time series: the “single site virtual weather data”

– Apply the two-step generator

– Select the weather at all the locations on the picked day to obtain the multi-site simulation

• Stations in the Pampas region, Argentina

– Pergamino

– Junin

– Nueve de Julio
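A small sketch of the multi-site idea. The precipitation values are made up, and the picked day is supplied directly as an index; in the full procedure it would come from running the two-step generator on the averaged series.

```python
def virtual_single_site(station_series):
    """Average the station series day by day to get one 'virtual' site."""
    n_days = len(next(iter(station_series.values())))
    n_sites = len(station_series)
    return [sum(series[d] for series in station_series.values()) / n_sites
            for d in range(n_days)]

def weather_at_all_sites(station_series, picked_day):
    """Take the observed weather of every station on the picked historical day."""
    return {name: series[picked_day] for name, series in station_series.items()}

# Hypothetical daily precipitation (mm) at the three stations.
precip = {
    "Pergamino": [0.0, 12.0, 3.0, 0.0],
    "Junin": [0.0, 9.0, 6.0, 0.0],
    "Nueve de Julio": [3.0, 15.0, 0.0, 0.0],
}
virtual = virtual_single_site(precip)
simulated_day = weather_at_all_sites(precip, picked_day=1)
```

Because all stations are taken from the same historical day, the observed spatial dependence between them is carried into the simulation.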

Page 17

wet and dry spell Statistics

Pergamino, Argentina

Multisite Case

Page 18

Basic Distribution Properties

Page 19

Spatial Correlation

Page 21

Motivation

[Figure: an input distribution passes through the WTP to give an output distribution, split into “Comply” and “Non-Compliance”; both panels plot probability density function against sw_avg]

Uncertainty helps us to understand the risk of non-compliance with a given regulation

Page 22

• Monitoring effort mandated by USEPA

• Large public water systems

• Water quality and operating data

- Disinfection by-products (DBPs) and microorganisms to support rulemakings

• Most comprehensive view of large drinking water systems to date

Data Set: Information Collection Rule (ICR)

Page 23

• 18 months (Jul. 1997 – Dec. 1998)

• 458 continental US locations

Data Set

ICR

Page 24

Data Set

• Water Quality – Influent

– Intermediate

– Finished

– Distribution system

• Chemical Additions

ICR Database

Page 25

Influent water quality has significant variability due to

- climate

- geology

- water management practices

Characterize Variability

Source Water

• TOC

• TSUVA

• Alkalinity

• pH

• Turbidity

• Temperature

• Total Hardness

Page 26

• Examine influent water quality for surface waters (SWs)

– Spatial variability

– Temporal variability

• Focus on total organic carbon (TOC)

– TOC is a precursor in formation of DBPs

– Methods extend to other water quality parameters

Variability

Page 27

Spatial Variability

Variability

• Local polynomial approach

• Find best K and P combination

• Contour estimates

$$TOC_{annual\_average} = f(Latitude, Longitude)$$

Page 28

Spatial Variability SW Average Annual TOC (mg/L)

Variability

2,30. P

Page 29

Spatial Variability

Variability

Similar spatial patterns found for

• Finished water TOC (lower)

• Distribution system DBPs

– TTHM (total trihalomethanes)

– HAA5 (five haloacetic acids)

Page 30

Spatial Variability

Variability

• Alkalinity

• Bromide

Spatial patterns consistent with previous research for other influent water quality variables

Page 31

Variability

Temporal Variability

[Figure: influent TOC (mg/L) by month (J to D), 1998]

City of Boulder’s Betasso Water Treatment Plant (CO)

Page 32

Variability

Temporal Variability

• Some locations exhibited seasonal trends, others did not

• Month to month variations should be considered

Page 33

• Inherent variability in water quality contributes to uncertainty

• How can we quantify uncertainty?

Variability

Page 34

Simulate “ensembles” of influent water quality (Monte Carlo)

Quantify Uncertainty

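Observed data

$$\begin{bmatrix} TOC_{1} & \cdots & TOC_{12} \end{bmatrix}$$

Ensembles

$$\begin{bmatrix} TOC_{S1\_1} & \cdots & TOC_{S1\_12} \\ \vdots & & \vdots \\ TOC_{S100\_1} & \cdots & TOC_{S100\_12} \end{bmatrix}$$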

Page 35

Normal

Lognormal

• Fit a probability density function (pdf) to the data: Normal, Lognormal, etc.

• Simulate from pdf

Quantify

Traditional Method
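A minimal sketch of the traditional method, assuming a lognormal pdf fitted by taking the mean and standard deviation of the log-transformed data. The observed values are made up and the function names are illustrative.

```python
import math
import random
import statistics

def fit_lognormal(data):
    """Fit a lognormal by the mean and standard deviation of log(data)."""
    logs = [math.log(x) for x in data]
    return statistics.mean(logs), statistics.stdev(logs)

def simulate_lognormal(mu, sigma, n, rng):
    """Draw n values from the fitted lognormal pdf."""
    return [rng.lognormvariate(mu, sigma) for _ in range(n)]

rng = random.Random(1)
# Hypothetical observed TOC values (mg/L).
observed_toc = [2.1, 2.4, 3.0, 3.8, 3.1, 2.6, 2.2, 2.0, 1.9, 2.3, 2.5, 2.2]
mu, sigma = fit_lognormal(observed_toc)
ensemble = simulate_lognormal(mu, sigma, 100, rng)
```

The quality of the ensemble depends entirely on how well the chosen pdf fits the data.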

Page 36

Limitations

- What if the pdf is not a good fit?

- What if you don’t have enough data to make the pdf? (e.g., 18 months/location in the ICR database)

[Figure: histogram of May (density; x-axis from 1000 to 6000)]

Quantify

Page 37

• Skip fitting a pdf to the data

• Simulate by bootstrapping

• Randomly sample data with replacement

• Expand bootstrapping pool to include “similar” locations (nearest neighbors)

• What is limited in time is available in space

Space-Time Bootstrapping Method

Quantify

Page 38

• Find nearest neighbors (locations) in terms of a feature vector that includes variables of interest

• Feature vector includes:

- Average Annual Concentration

- Latitude

- Longitude

Quantify

$$FeatureVector = (TOC_{average}, Lat, Lon)$$

Page 39

Average annual concentration helps find neighbors that are similar but may not be geographically nearby.

Average annual TOC (mg/L) for Ohio surface waters

Geographically close, but not good “neighbors” for bootstrapping

Quantify

Page 40

Quantify

$$FeatureVector = (TOC_{average}, Lat, Lon)$$

• Sample monthly TOC values based on feature vector

• Conditional probability

$$f(TOC_{monthly} \mid FeatureVector)$$

Page 41

Simulation Algorithm

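1) User inputs their location and their average annual TOC concentration

$$x_{user} = (TOC_{user}, Lat_{user}, Lon_{user})$$

2) The ICR database is queried for all eligible entries

$$x_{ICR} = \begin{bmatrix} TOC_{1} & Lat_{1} & Lon_{1} \\ \vdots & \vdots & \vdots \\ TOC_{i} & Lat_{i} & Lon_{i} \\ \vdots & \vdots & \vdots \\ TOC_{m} & Lat_{m} & Lon_{m} \end{bmatrix}$$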

Quantify

Page 42

Algorithm- cont.

3) Calculate distances, d, between the $x_{user}$ vector and the $x_{ICR}$ vector

$$d = \lVert x_{user} - x_{ICR} \rVert$$

Quantify

Page 43

Algorithm- cont.

3) Calculate distances using weighted Mahalanobis equation

Quantify

$$d_i = \sqrt{(W(x_{user} - x_{ICR\_i}))^{T} \, S^{-1} \, (W(x_{user} - x_{ICR\_i}))}$$

Page 44

$$d_i = \sqrt{(W(x_{user} - x_{ICR\_i}))^{T} \, S^{-1} \, (W(x_{user} - x_{ICR\_i}))}$$

Algorithm- cont.

Quantify

Remove the weights (W) and the covariance matrix (S) and it is the Euclidean distance
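As a worked check of this statement, assume the distance is the square root of the quadratic form and that "removing" W and S means replacing both with the identity matrix. The expression then reduces to the ordinary Euclidean distance over the three feature-vector components:

```latex
d_i = \sqrt{(x_{user} - x_{ICR\_i})^{T} (x_{user} - x_{ICR\_i})}
    = \sqrt{(TOC_{user} - TOC_{i})^{2} + (Lat_{user} - Lat_{i})^{2} + (Lon_{user} - Lon_{i})^{2}}
```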

Page 45

$$d_i = \sqrt{(W(x_{user} - x_{ICR\_i}))^{T} \, S^{-1} \, (W(x_{user} - x_{ICR\_i}))}$$

Algorithm- cont.

Quantify

By including S, the covariance matrix, the components of the feature vector do not have to be scaled (Davis 1986)

Page 46

Algorithm- cont.

Quantify

$$d_i = \sqrt{(W(x_{user} - x_{i}))^{T} \, S^{-1} \, (W(x_{user} - x_{i}))}$$

Weights are assigned as

$$W = \begin{bmatrix} W_{TOC} & W_{Lat} & W_{Lon} \end{bmatrix}$$

Page 47

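Quantify

Weights offer flexibility in neighbor selection

[Figure: panels (a) to (d) show the neighbors selected under four settings of the weights $(W_{TOC}, W_{Lat}, W_{Lon})$: each weight alone set to 1 with the other two set to 0, and all three set to 1]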
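The weighted Mahalanobis distance can be sketched in plain Python as below. Several things here are assumptions: the distance is taken as the square root of the quadratic form, S is estimated as the sample covariance of the database feature vectors, and the weights are applied element by element. The database rows are made up, and `solve` is a small helper written for this sketch, not a library routine.

```python
import math

def covariance(rows):
    """Sample covariance matrix of the feature vectors (rows)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[c] for r in rows) / n for c in range(p)]
    return [[sum((r[a] - means[a]) * (r[b] - means[b]) for r in rows) / (n - 1)
             for b in range(p)] for a in range(p)]

def solve(matrix, rhs):
    """Solve matrix * y = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    y = [0.0] * n
    for r in range(n - 1, -1, -1):
        y[r] = (aug[r][n] - sum(aug[r][c] * y[c] for c in range(r + 1, n))) / aug[r][r]
    return y

def weighted_mahalanobis(x_user, x_i, weights, cov):
    """sqrt(v^T S^-1 v) with v = W (x_user - x_i), W applied element-wise."""
    v = [w * (u - x) for w, u, x in zip(weights, x_user, x_i)]
    y = solve(cov, v)  # y = S^-1 v, without forming the inverse
    return math.sqrt(sum(a * b for a, b in zip(v, y)))

# Hypothetical database entries: (average annual TOC in mg/L, latitude, longitude).
x_icr = [(2.5, 40.0, -105.3), (3.1, 39.7, -104.9), (1.8, 33.5, -86.8),
         (4.2, 40.3, -74.1), (2.9, 39.1, -84.5), (3.6, 41.0, -96.0)]
x_user = (2.6, 40.0, -105.2)
weights = (1.0, 1.0, 1.0)
S = covariance(x_icr)
distances = [weighted_mahalanobis(x_user, row, weights, S) for row in x_icr]
ranked_neighbors = sorted(range(len(x_icr)), key=lambda i: distances[i])
```

Setting one weight to 0 drops that component from the distance, which is how the weights change which neighbors are selected.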

Page 48

4) Obtain observed monthly data for each nearest neighbor

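$$x_{NN} = \begin{bmatrix} TOC_{1\_Jan} & \cdots & TOC_{1\_Dec} \\ \vdots & & \vdots \\ TOC_{i\_Jan} & \cdots & TOC_{i\_Dec} \\ \vdots & & \vdots \\ TOC_{k\_Jan} & \cdots & TOC_{k\_Dec} \end{bmatrix}$$

Algorithm- cont.

Quantify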

Page 49

5) Bootstrap xNN using a weight function

$$p_j = \frac{1/j}{\sum_{i=1}^{k} 1/i}$$

Algorithm- cont.

Quantify

Increases likelihood of picking nearer neighbors
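A sketch of this bootstrap step, assuming `x_nn` holds one row of 12 monthly values per neighbor, ordered nearest first. Drawing a neighbor independently for each month of each simulation is an assumption of this sketch. All values are made up.

```python
import random

def neighbor_probabilities(k):
    """p_j = (1/j) / sum_{i=1..k} (1/i) for the j-th nearest neighbor."""
    total = sum(1.0 / i for i in range(1, k + 1))
    return [(1.0 / j) / total for j in range(1, k + 1)]

def bootstrap_ensemble(x_nn, n_sims, rng):
    """Return n_sims simulated years, each built month by month from a weighted pick."""
    probs = neighbor_probabilities(len(x_nn))
    n_months = len(x_nn[0])
    ensemble = []
    for _ in range(n_sims):
        year = []
        for month in range(n_months):
            neighbor = rng.choices(x_nn, weights=probs, k=1)[0]
            year.append(neighbor[month])
        ensemble.append(year)
    return ensemble

rng = random.Random(11)
# Hypothetical monthly TOC (mg/L), Jan..Dec, for the 3 nearest neighbors.
x_nn = [
    [2.0, 2.1, 2.3, 3.0, 3.9, 3.5, 2.8, 2.5, 2.3, 2.2, 2.1, 2.0],
    [1.8, 1.9, 2.2, 2.8, 3.6, 3.2, 2.6, 2.4, 2.2, 2.0, 1.9, 1.8],
    [2.4, 2.4, 2.6, 3.3, 4.2, 3.8, 3.0, 2.7, 2.6, 2.5, 2.4, 2.3],
]
probs = neighbor_probabilities(len(x_nn))
ensemble = bootstrap_ensemble(x_nn, n_sims=100, rng=rng)
```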

Page 50

Apply algorithm to quantify uncertainty in influent TOC concentration

City of Boulder’s Betasso Water Treatment Plant (CO)

Boulder

SWs only, N = 334

Quantify

Page 51

Red dot is the Boulder plant being simulated

Empty black dots are the “neighbors” to be bootstrapped

Identify nearest neighbors

- Include Boulder in pool for bootstrapping

$$W_{TOC} = W_{Lat} = W_{Lon} = 1$$

Quantify

Page 52

Quantify

Box plot each monthly bootstrap ensemble (100 values)

[Figure: box plots of influent TOC (mg/L) for each month (J to D) and the annual value (Ann); legend: median, 25th and 75th percentiles, 5th and 95th percentiles, outliers]
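The box plot statistics for one monthly ensemble can be computed as sketched below. The percentile uses linear interpolation between ordered values, which is one common convention and not necessarily the one behind the plots. The ensemble values are made up.

```python
def percentile(values, q):
    """Percentile q (0 to 100) by linear interpolation between ordered values."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def box_summary(values):
    """The five levels drawn in the box plot legend."""
    return {q: percentile(values, q) for q in (5, 25, 50, 75, 95)}

# Hypothetical ensemble of 100 simulated values for one month.
monthly_ensemble = [float(v) for v in range(1, 101)]
summary = box_summary(monthly_ensemble)
```

Values outside the 5th to 95th percentile range would be the ones drawn as outliers.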

Page 53

Uncertainty quantified for Boulder

[Figure: box plots of simulated influent TOC (mg/L) by month (J to D) and annual (Ann), with 1998 data]

Quantify

• Simulates seasonal trends

• Provides rich variety of uncertainty

Page 54

Overlay recent data

• Simulations capture recent data

[Figure: box plots of simulated influent TOC (mg/L) by month (J to D) and annual (Ann), overlaid with data from 1997, 1998, 2003, 2004 and 2005]

Quantify

Page 55

City of Birmingham’s Carson Filter Plant (AL)

[Figure: influent TOC (mg/L) by month (J to D) and annual (Ann), with 1998 data]

Quantify

Portable Across Locations

Page 56

City of Birmingham’s Carson Filter Plant (AL)

[Figure: influent TOC (mg/L) by month (J to D) and annual (Ann)]

Quantify

Portable Across Locations

Page 57

City of Birmingham’s Carson Filter Plant (AL)

[Figure: influent TOC (mg/L) by month (J to D) and annual (Ann), overlaid with data from 1997, 1998, 2003, 2004 and 2005]

Quantify

Portable Across Locations

Page 58

[Figure: influent alkalinity (as mg/L CaCO3) by month (J to D) and annual (Ann), with 1998 data]

New Jersey American Water Swimming River Treatment Plant (NJ)

Quantify

Applies to Other Variables

Page 59

[Figure: influent alkalinity (as mg/L CaCO3) by month (J to D) and annual (Ann), with 1998 data]

New Jersey American Water Swimming River Treatment Plant (NJ)

Quantify

Applies to Other Variables

Page 60

[Figure: influent alkalinity (as mg/L CaCO3) by month (J to D) and annual (Ann), overlaid with data from 1997, 1998, 2002, 2003, 2004 and 2005]

New Jersey American Water Swimming River Treatment Plant (NJ)

Quantify

Applies to Other Variables

Page 61

• The K-NN resampling technique provides a simple and robust alternative for generating ‘scenarios’.

– Quantify uncertainty

– Ensemble forecast

• Very general: can be easily applied to a variety of situations.

– Weather generation

– Water quality

– Streamflow (Colorado River Basin)

Summary & Conclusions

Page 62

• Can readily be extended to generate ‘scenarios’ under climate change or decadal variability

– Modify the ‘feature vector’ to include the climate variability information

• Rajagopalan and Lall (1999); Yates et al. (2003), Apipattanavis et al. (2007) - all papers in Water Resources Research

Page 63

AwwaRF project 3115

“Decision Tool to Help Utilities Develop Simultaneous Compliance Strategies”

Utilities

City of Boulder’s Betasso Water Treatment Plant (CO)

City of Birmingham’s Carson Filter Plant (AL)

New Jersey American Water Swimming River Treatment Plant (NJ)

Greater Cincinnati (OH) Water Works Richard Miller Water Treatment Plant

Acknowledgements

Page 64

Questions

“It is better to be roughly right than precisely wrong.”

-John Maynard Keynes (1883-1946)