Approximating Sensor Network Queries Using In-Network Summaries Alexandra Meliou Carlos Guestrin Joseph Hellerstein

Post on 23-Feb-2016


TRANSCRIPT

Page 1: Approximating Sensor Network Queries Using In-Network Summaries

Approximating Sensor Network Queries Using In-Network Summaries

Alexandra Meliou, Carlos Guestrin, Joseph Hellerstein

Page 2: Approximating Sensor Network Queries Using In-Network Summaries

Approximate Answer Queries

Approximate representation of the world:
• Discrete locations
• Lossy communication
• Noisy measurements

Applications do not expect exact values (tolerance to noise)

Example: Return the temperature at all locations ±1°C, with 95% confidence

Query satisfaction: in expectation, the requested fraction of sensor values lies within the error range

Page 3: Approximating Sensor Network Queries Using In-Network Summaries

In-network Decisions

Query

Use in-network models to make routing decisions

No centralized planning

Page 4: Approximating Sensor Network Queries Using In-Network Summaries

In-network Summaries

Spanning tree T(V,E’) + models M_v for all nodes v

M_v represents the whole subtree rooted at v.

Page 5: Approximating Sensor Network Queries Using In-Network Summaries

Model Complexity

Need for compression

Gaussian distributions at the leaves:
• good for modeling individual node measurements

Page 6: Approximating Sensor Network Queries Using In-Network Summaries

Talk “outline”

In-network summaries: Compression, Traversal, Construction

Page 7: Approximating Sensor Network Queries Using In-Network Summaries

Collapsing Gaussian Mixtures

• Compress an m-size mixture to a k-size mixture.
• Look at the simple case (k=1)
• Minimize KL-divergence? “Fake” mass
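A minimal sketch of the k=1 case, assuming 1-D components given as weights, means and variances (names chosen here for illustration). Minimizing KL(mixture ‖ Gaussian) over single Gaussians is equivalent to matching the mixture's mean and variance. One reading of the “fake” mass remark is that, for well-separated components, the collapsed Gaussian puts most of its mass where the mixture has almost none; the example below illustrates that reading.

```python
import math

def collapse_moment_match(weights, means, variances):
    """Collapse a 1-D Gaussian mixture to one Gaussian by matching mean and variance."""
    total = sum(weights)
    w = [x / total for x in weights]                      # normalize the weights
    mu = sum(wi * mi for wi, mi in zip(w, means))
    second = sum(wi * (vi + mi * mi) for wi, mi, vi in zip(w, means, variances))
    return mu, second - mu * mu                           # (mean, variance)

def gaussian_mass(mu, var, lo, hi):
    """Probability mass of N(mu, var) inside [lo, hi]."""
    sd = math.sqrt(var)
    cdf = lambda x: 0.5 * (1.0 + math.erf((x - mu) / (sd * math.sqrt(2.0))))
    return cdf(hi) - cdf(lo)

# Hypothetical example: two well-separated temperature clusters.
mu, var = collapse_moment_match([0.5, 0.5], [10.0, 30.0], [1.0, 1.0])
collapsed_mass = gaussian_mass(mu, var, mu - 5, mu + 5)
mixture_mass = 0.5 * gaussian_mass(10.0, 1.0, mu - 5, mu + 5) \
             + 0.5 * gaussian_mass(30.0, 1.0, mu - 5, mu + 5)
print(mu, var, collapsed_mass, mixture_mass)
```

Here the collapsed Gaussian is centered between the two clusters, so it reports substantial mass in a window around its mean even though the mixture has almost none there.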

Page 8: Approximating Sensor Network Queries Using In-Network Summaries

Quality of Compression

Depends on query workload

Query with acceptable error window W

Query with acceptable error window W’<W

Page 9: Approximating Sensor Network Queries Using In-Network Summaries

Compression

Accurate mass inside interval

No guarantee on the tails

\[ \max_z \int_{z-w}^{z+w} f(x)\,dx \]

\[ \int_{\mu-w}^{\mu+w} N(\mu,\sigma^2)\,dx = \sum_i \int_{\mu-w}^{\mu+w} N_i(\mu_i,\sigma_i^2)\,dx \]
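The two conditions on this slide can be sketched as follows, under stated assumptions: f is a weighted 1-D Gaussian mixture whose weights sum to 1 (the slide shows no weights), the maximizing center is found by a plain grid search, and the function names are hypothetical. The center μ is taken as the z with the most mixture mass in [z−w, z+w], and σ is then chosen so that the single Gaussian holds exactly that mass in the same window, using the fact that N(μ,σ²) has mass 2Φ(w/σ)−1 in [μ−w, μ+w].

```python
from statistics import NormalDist

def mixture_mass(weights, means, sigmas, lo, hi):
    """Mass of the weighted Gaussian mixture inside [lo, hi]."""
    return sum(wt * (NormalDist(m, s).cdf(hi) - NormalDist(m, s).cdf(lo))
               for wt, m, s in zip(weights, means, sigmas))

def compress_to_sgm(weights, means, sigmas, w, steps=2000):
    """Return (mu, sigma, mass) of a single Gaussian matched on a window of half-width w."""
    lo = min(m - 3 * s for m, s in zip(means, sigmas))
    hi = max(m + 3 * s for m, s in zip(means, sigmas))
    best_z, best_mass = lo, -1.0
    for i in range(steps + 1):                      # grid search for the best center z
        z = lo + (hi - lo) * i / steps
        mass = mixture_mass(weights, means, sigmas, z - w, z + w)
        if mass > best_mass:
            best_z, best_mass = z, mass
    # Clamp so the inverse CDF below stays inside its domain (0, 1).
    m = min(max(best_mass, 1e-12), 1.0 - 1e-12)
    # Solve 2*Phi(w/sigma) - 1 = m for sigma.
    sigma = w / NormalDist().inv_cdf((1.0 + m) / 2.0)
    return best_z, sigma, best_mass

# Hypothetical example: a dominant cluster near 20 and a smaller one near 30.
mu, sigma, mass = compress_to_sgm([0.7, 0.3], [20.0, 30.0], [1.0, 1.0], w=2.0)
print(mu, sigma, mass)
```

The resulting Gaussian agrees with the mixture on the window around μ by construction, while its tails need not resemble the mixture's, which matches the “no guarantee on the tails” caveat.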

Page 10: Approximating Sensor Network Queries Using In-Network Summaries

Talk “outline”

In-network summaries: Compression, Traversal, Construction

Page 11: Approximating Sensor Network Queries Using In-Network Summaries

Query Satisfaction

A response R={r1…rn} satisfies query Q(w,δ) if, in expectation, the values of at least δn nodes lie within [ri−w, ri+w]:

\[ \sum_i \int_{r_i-w}^{r_i+w} f_i(x)\,dx \ge \delta n \]

[Figure: query Q posed to the in-network summary; response R = [r1, r2, r3, r4, r5, r6, r7, r8, r9, r10], within error bounds]
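The satisfaction condition can be checked directly when each node's density f_i is a Gaussian. The sketch below assumes that form and uses hypothetical names; each model is a (mean, sigma) pair.

```python
from statistics import NormalDist

def expected_in_range(response, models, w):
    """Sum over nodes i of the mass of f_i inside [r_i - w, r_i + w]."""
    total = 0.0
    for r, (mean, sigma) in zip(response, models):
        dist = NormalDist(mean, sigma)
        total += dist.cdf(r + w) - dist.cdf(r - w)
    return total

def satisfies(response, models, w, delta):
    """True if, in expectation, at least delta * n node values lie within the window."""
    return expected_in_range(response, models, w) >= delta * len(models)

# Hypothetical example: three nodes all answered with one summary value, 20.5.
models = [(20.0, 0.5), (21.0, 0.5), (25.0, 0.5)]
response = [20.5, 20.5, 20.5]
print(expected_in_range(response, models, w=1.0))
print(satisfies(response, models, w=1.0, delta=0.5))
print(satisfies(response, models, w=1.0, delta=0.6))
```

In this example the third node is far from the shared answer, so the response covers roughly two of three nodes in expectation and fails once δ demands more than that.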

Page 12: Approximating Sensor Network Queries Using In-Network Summaries

Optimal Traversal

Given: tree T = G(V,E) and models M_v

Find: subtree G(V',E'), E' ⊆ E, such that

\[ \sum_{\text{leaves}} \mathrm{Mass}(M_v, w) \ge \delta n \]

Can be computed with dynamic programming

Response: [μ_leaves]
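The slide does not spell out the dynamic program, so the following is only one possible sketch under several assumptions: each node carries a precomputed `mass` (the expected number of sensors within ±w if the answer stops at that node), the cost of a traversal is the number of nodes visited, and the goal is the cheapest traversal whose stopping points together reach δn. The `Node` class and function names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    mass: float                                   # assumed: expected sensors within +/- w if we stop here
    children: list = field(default_factory=list)

def options(v):
    """Map visited-node count -> largest total mass over traversals of v's subtree."""
    best = {1: v.mass}                            # option 1: stop at v and answer from M_v
    if v.children:
        combined = {1: 0.0}                       # option 2: visit v, then descend into every child
        for child in v.children:
            merged = {}
            for c1, m1 in combined.items():       # knapsack-style merge of child tables
                for c2, m2 in options(child).items():
                    c, m = c1 + c2, m1 + m2
                    if m > merged.get(c, -1.0):
                        merged[c] = m
            combined = merged
        for c, m in combined.items():
            if m > best.get(c, -1.0):
                best[c] = m
    return best

def cheapest_traversal(root, delta, n):
    """Smallest number of visited nodes whose stopping points reach delta * n, or None."""
    feasible = [c for c, m in options(root).items() if m >= delta * n]
    return min(feasible) if feasible else None

# Hypothetical 4-sensor tree: subtree A is well summarized, subtree B is not.
leaf = lambda: Node(1.0)
a = Node(1.9, [leaf(), leaf()])
b = Node(1.2, [leaf(), leaf()])
root = Node(2.0, [a, b])
print(cheapest_traversal(root, delta=0.9, n=4))
```

In the example the cheapest satisfying traversal stops at A but expands B down to its sensors, which a per-subtree rule would also find here, though in general the two can differ.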

Page 13: Approximating Sensor Network Queries Using In-Network Summaries

Greedy Traversal

If the local model satisfies

\[ \int_{\mu-w}^{\mu+w} f(x)\,dx \ge \delta \]

return μ; else descend to the child nodes.

More conservative solution: enforces query satisfiability on every subtree instead of the whole tree
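The greedy rule can be sketched as a short recursion, assuming each node's local model is a single Gaussian with mean `mu` and standard deviation `sigma` (the `Node` class and function names are hypothetical). A node answers with its own mean when its model holds at least δ of its mass within ±w of that mean; otherwise the query is forwarded to its children.

```python
from dataclasses import dataclass, field
from statistics import NormalDist

@dataclass
class Node:
    mu: float
    sigma: float
    children: list = field(default_factory=list)

def local_mass(node, w):
    """Mass of the node's Gaussian model inside [mu - w, mu + w]."""
    d = NormalDist(node.mu, node.sigma)
    return d.cdf(node.mu + w) - d.cdf(node.mu - w)

def greedy_traversal(node, w, delta):
    """Return the list of means that answers the subtree rooted at `node`."""
    if local_mass(node, w) >= delta or not node.children:
        return [node.mu]                          # local model is good enough, or nothing below
    answer = []
    for child in node.children:                   # otherwise ask every child
        answer.extend(greedy_traversal(child, w, delta))
    return answer

# Hypothetical tree: a broad root model over two tight child models.
root = Node(22.0, 3.0, [Node(20.0, 0.4), Node(25.0, 0.4)])
print(greedy_traversal(root, w=1.0, delta=0.95))
print(greedy_traversal(root, w=10.0, delta=0.95))
```

With a narrow window the root cannot answer and both children respond; with a wide window the root alone suffices.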

Page 14: Approximating Sensor Network Queries Using In-Network Summaries

Traversal Evaluation

Page 15: Approximating Sensor Network Queries Using In-Network Summaries

Talk “outline”

In-network summaries: Compression, Traversal, Construction

Page 16: Approximating Sensor Network Queries Using In-Network Summaries

Optimal Tree Construction

Given a structure, we know how to build the models

But how do we pick the structure?

Page 17: Approximating Sensor Network Queries Using In-Network Summaries

Traversal = cut

Theorem: In a fixed-fanout tree, the cost of the traversal is

\[ \frac{F}{F-1}\,\bigl(|C| - 1\bigr) \]

where |C| is the size of the cut, and F the fanout

Intuition: minimize cut size

Group nodes into a minimum number of groups which satisfy the query constraints

Clustering problem
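The expression F/(F−1)·(|C|−1) from the theorem can be sanity-checked with a small simulation. The sketch assumes that cost counts the tree edges followed by the traversal and that every expanded node has exactly F children; both are assumptions made here, not statements from the slides.

```python
def traversal_cost(cut_size, fanout):
    """Closed form F / (F - 1) * (|C| - 1)."""
    return fanout * (cut_size - 1) / (fanout - 1)

def simulate(expansions, fanout):
    """Start with the root as the cut and expand one cut node at a time."""
    cut_size, edges = 1, 0
    for _ in range(expansions):
        cut_size += fanout - 1        # one cut node is replaced by its `fanout` children
        edges += fanout               # the traversal follows `fanout` new edges
    return cut_size, edges

for fanout in (2, 3, 4):
    for expansions in range(5):
        cut_size, edges = simulate(expansions, fanout)
        print(fanout, cut_size, edges, traversal_cost(cut_size, fanout))
```

Since each expansion adds F edges but only F−1 to the cut size, the cost grows linearly in |C|, which is the intuition behind minimizing the cut.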

Page 18: Approximating Sensor Network Queries Using In-Network Summaries

Optimal Clustering

Given a query Q(w,δ), optimal clustering is NP-hard
• Related to the Group Steiner Tree Problem

Greedy algorithm with factor log(n) approximation
• Greedily pick max size cluster
• Issue: does not enforce connectivity of clusters
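The “greedily pick max size cluster” step has the shape of the standard greedy set-cover heuristic. The sketch below assumes the candidate clusters are given up front as sets of node ids, each already known to satisfy the query constraints; how such candidates are generated is not covered by the slide, and the function name is hypothetical.

```python
def greedy_cover(nodes, candidate_clusters):
    """Repeatedly pick the candidate cluster covering the most uncovered nodes."""
    uncovered = set(nodes)
    chosen = []
    while uncovered:
        best = max(candidate_clusters, key=lambda c: len(uncovered & c))
        gained = uncovered & best
        if not gained:
            raise ValueError("candidate clusters do not cover all nodes")
        chosen.append(set(best))      # keep the whole valid cluster, overlaps allowed
        uncovered -= gained
    return chosen

# Hypothetical example with six sensor ids.
candidates = [{1, 2, 3, 4}, {3, 4, 5}, {5, 6}, {1, 2}]
clusters = greedy_cover(range(1, 7), candidates)
print(clusters)
```

As the slide notes, nothing here checks that a chosen cluster is connected in the network graph, so the clusters may not be usable as subtrees without further work.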

Page 19: Approximating Sensor Network Queries Using In-Network Summaries

Greedy Clustering

Include extra nodes to enforce connectivity

Augment clusters only with accessible nodes (losing the log(n) guarantee)

Page 20: Approximating Sensor Network Queries Using In-Network Summaries

Clustering comparison

Two distributed clustering algorithms are compared to the centralized greedy clustering

Page 21: Approximating Sensor Network Queries Using In-Network Summaries

Talk “outline”

In-network summaries: Compression, Traversal, Construction, Enriched models

Page 22: Approximating Sensor Network Queries Using In-Network Summaries

Enriched models

Support more complex models:
• k-mixtures: compress to a k-size mixture instead of an SGM
• Virtual nodes: every component of the k-size mixture is stored as a separate “virtual node”
• SGMs on multiple windows: maintain additional SGMs for different window sizes

More space, more expensive model updates

(SGM = Single Gaussian Model)

Page 23: Approximating Sensor Network Queries Using In-Network Summaries

Evaluation of enriched models

SGM surprisingly effective in representing the underlying data

Page 24: Approximating Sensor Network Queries Using In-Network Summaries

Sensitivity analysis

Talk “outline”

In-network summaries: Compression, Traversal, Construction

Page 25: Approximating Sensor Network Queries Using In-Network Summaries

Tree Construction Parameters and Effect on Performance

• Confidence: performance for workloads of different confidence than the hierarchy design
• Error window: broader vs. narrower ranges of window sizes; assignment of windows across tree levels
• Temporal changes: how often should the models be updated?

Page 26: Approximating Sensor Network Queries Using In-Network Summaries

Confidence

Workload of 0.95 confidence

Design confidence does not have a big impact on performance

Page 27: Approximating Sensor Network Queries Using In-Network Summaries

Error windows

A wide range is not always better, because it forces the traversal of more levels

Page 28: Approximating Sensor Network Queries Using In-Network Summaries

Model Updates

Page 29: Approximating Sensor Network Queries Using In-Network Summaries

Conclusions

• Analyzed compression schemes for in-network summaries
• Evaluated summary traversal
• Studied optimal hierarchy construction
• Studied increased complexity models: showed that simple SGMs are sufficient
• Analyzed the effect on efficiency of various parameters

Talk “outline”

In-network summaries: Compression, Traversal, Construction, Enriched models, Sensitivity analysis