gap-filling and fault-detection for the life under your feet dataset

50
Gap-filling and Fault-detection for the life under your feet dataset

Upload: dreama

Post on 14-Jan-2016

19 views

Category:

Documents


1 download

DESCRIPTION

Gap-filling and Fault-detection for the life under your feet dataset. Locations. Soil Temperature Dataset. Observations. Data is Correlated in time and space Evolving over time (seasons) Gappy (Due to failures) Faulty (Noise, Jumps in values). Examples of Faults. Motivation. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Gap-filling and Fault-detection  for the  life under your feet dataset

Gap-filling and Fault-detection for the

life under your feet dataset

Page 2: Gap-filling and Fault-detection  for the  life under your feet dataset

Locations

Page 3: Gap-filling and Fault-detection  for the  life under your feet dataset

Soil Temperature Dataset

Page 4: Gap-filling and Fault-detection  for the  life under your feet dataset

Observations

• Data is – Correlated in time and space– Evolving over time (seasons)– Gappy (Due to failures)– Faulty (Noise, Jumps in values)

Page 5: Gap-filling and Fault-detection  for the  life under your feet dataset

Examples of Faults

Page 6: Gap-filling and Fault-detection  for the  life under your feet dataset

Motivation

• Environmental Scientists want gap-filled and fault-free data

• 8% of the data is missing

• Of the available data, 10% is faulty

• Scientific libraries such as covariance, SVD etc do not handle gaps and faults well

Page 7: Gap-filling and Fault-detection  for the  life under your feet dataset

Basic Observations

• Hardware failures causes entire “days” of data to be missing– Temporal correlation for integrity is less effective

• If we look in the spatial domain,– For a given time, at least > 50% of the sensors are

active– Use the spatial basis instead.– Initialize spatial basis using the period marked

“good period” in previous slide

Page 8: Gap-filling and Fault-detection  for the  life under your feet dataset

Basic Methods

Page 9: Gap-filling and Fault-detection  for the  life under your feet dataset

The Johns Hopkins University

Principal Component Analysis (PCA)

PCA :- Finds axes of maximum

variance

- Reduces original dimensionality

(In e.g. from 2 variables => 1 variable)

First Principal Component

Variable #1

Varia

ble

#2

X : Points original spaceO : Projection on PC1

Page 10: Gap-filling and Fault-detection  for the  life under your feet dataset

Gap Filling (Connolly & Szalay)+

Original signal representation as a linear combination of orthogonal vectors

Optimization function- Wλ = 0 if data is absent- Find coefficients, ai’s, that minimize function.

+ Connolly et al., A Robust Classification of Galaxy Spectra: Dealing with Noisy and Incomplete Data, http://arxiv.org/PS_cache/astro-ph/pdf/9901/9901300v1.pdf

Page 11: Gap-filling and Fault-detection  for the  life under your feet dataset

Initialization

Good Period

Page 12: Gap-filling and Fault-detection  for the  life under your feet dataset

Naïve Gap-Filling

Page 13: Gap-filling and Fault-detection  for the  life under your feet dataset

Observations & Incremental Robust PCA *• Faulty Data

– Undesirable– Pollutes the PCA model– Impacts the quality of gap-filling

• Basis is time-dependent– Correlations are changing over time

• Simultaneous fault-detection & gap-filling– Initialize basis using reliable data– Using basis at time t,

• Detect and remove faulty readings• Weight new vectors depending on residuals• Use weights to update bases• Gap fill using the current basis• Update thresholds for declaring faults

– Iterate over data to improve estimates+ Budavári, T., Wild, V., Szalay, A. S., Dobos, L., Yip, C.-W. 2009 Monthly Notices of the Royal Astronomical Society, 394, 1496, http://adsabs.harvard.edu/abs/2009MNRAS.394.1496B

Page 14: Gap-filling and Fault-detection  for the  life under your feet dataset

Iterative and Robust gap-filling

Page 15: Gap-filling and Fault-detection  for the  life under your feet dataset

Example 1

Page 16: Gap-filling and Fault-detection  for the  life under your feet dataset

Example 2

Page 17: Gap-filling and Fault-detection  for the  life under your feet dataset

Erro

r Dis

trib

ution

in L

ocati

on

Page 18: Gap-filling and Fault-detection  for the  life under your feet dataset

Erro

r Dis

trib

ution

in T

ime

Page 19: Gap-filling and Fault-detection  for the  life under your feet dataset

Impact of number of missing values

Even when > 50% of the values are missingReconstruction is fairly good

Page 20: Gap-filling and Fault-detection  for the  life under your feet dataset

Erro

r Dis

trib

ution

IDW

(Loc

ation

)

Page 21: Gap-filling and Fault-detection  for the  life under your feet dataset

Erro

r Dis

trib

ution

IDW

(tim

e)

Page 22: Gap-filling and Fault-detection  for the  life under your feet dataset

Discussion

• Reception of this work by sensor network community

• Compare with other approaches– K-nn interpolation– Other model based methods such as BBQ

• Reconstruction depends– Number of missing values– Estimation of basis (seasonal effects)– Use the tradeoff between reconstruction accuracy and

power savings

Page 23: Gap-filling and Fault-detection  for the  life under your feet dataset

original

Page 24: Gap-filling and Fault-detection  for the  life under your feet dataset

Iterative and Robust gap-filling

Page 25: Gap-filling and Fault-detection  for the  life under your feet dataset

Smooth / Original

Page 26: Gap-filling and Fault-detection  for the  life under your feet dataset

Original and Ratio

Page 27: Gap-filling and Fault-detection  for the  life under your feet dataset

Soil Moisture and Ratio

Page 28: Gap-filling and Fault-detection  for the  life under your feet dataset

Moisture vs Ratio (zoom in)

Page 29: Gap-filling and Fault-detection  for the  life under your feet dataset

Statistics

Page 30: Gap-filling and Fault-detection  for the  life under your feet dataset

Locations and Durations

Page 31: Gap-filling and Fault-detection  for the  life under your feet dataset

Olin

Page 32: Gap-filling and Fault-detection  for the  life under your feet dataset

Principal Components

Page 33: Gap-filling and Fault-detection  for the  life under your feet dataset

original

Page 34: Gap-filling and Fault-detection  for the  life under your feet dataset

Slope and Intercept

Page 35: Gap-filling and Fault-detection  for the  life under your feet dataset

PC1 and PC2

Page 36: Gap-filling and Fault-detection  for the  life under your feet dataset

Forest Grass Winter

Page 37: Gap-filling and Fault-detection  for the  life under your feet dataset

Forest Grass Summer

Page 38: Gap-filling and Fault-detection  for the  life under your feet dataset

Mean scatter good locations

Page 39: Gap-filling and Fault-detection  for the  life under your feet dataset

Mean scatter 1 bad location

Page 40: Gap-filling and Fault-detection  for the  life under your feet dataset

2st component good scatter

Page 41: Gap-filling and Fault-detection  for the  life under your feet dataset

2nd component bad scatter

Page 42: Gap-filling and Fault-detection  for the  life under your feet dataset

Extra

Page 43: Gap-filling and Fault-detection  for the  life under your feet dataset

Soil

Moi

stur

e Er

ror (

Loca

tion)

Page 44: Gap-filling and Fault-detection  for the  life under your feet dataset

Soil

Moi

stur

e in

Tim

e

Page 45: Gap-filling and Fault-detection  for the  life under your feet dataset

Original Vs Fault Removed

Page 46: Gap-filling and Fault-detection  for the  life under your feet dataset

Example of running algorithm

Page 47: Gap-filling and Fault-detection  for the  life under your feet dataset

Daily Basis

Page 48: Gap-filling and Fault-detection  for the  life under your feet dataset

Collected measurements

Page 49: Gap-filling and Fault-detection  for the  life under your feet dataset

Temporal Gap Filling

• Data is correlated in the temporal domain

• Data shows– Diurnal patterns– Trends

• Trends are modeled using slope and intercept

• Correlations captured using functional PCA

Page 50: Gap-filling and Fault-detection  for the  life under your feet dataset

Accuracy of gap-filling

High Errors for some locations are actually due to faults Not necessarily due to the gap-filling method