ECSE 6962 lecture presentation 1
Compression in Correlated Data Aggregation
Zhenzhen Ye, Oct. 24th, 2005
Outline
- Background
- Information Fidelity
- Compression via Coding
- Data Aggregation with Compression: two basic strategies
  - Aggregation with Distributed Source Coding
  - Aggregation with Explicit Communication
- An Example
- Conclusion & Future Work
Background
- Data Aggregation in Wireless Sensor Networks: reduces energy cost, prolongs network lifetime, etc.
- Aggregation Functions: simple functions (max, min, average, etc.); field reconstruction, which is much more challenging.
- Example [1]: measuring, conveying and reproducing a temperature field over a region G.
Data Correlation
- A Motivation for Compression: spatial correlation and temporal correlation; similar to image compression, but in a distributed fashion.
- Correlation Structure Model: a stationary random process Y(x, t).
- Example: a one-dimensional Gaussian random field [2]:
  E[Y(x_1, t_1) Y(x_2, t_2)] = σ² exp[ −c ((x_1 − x_2)² + θ (t_1 − t_2)²)^k ]

  where c measures the intensity of correlation, θ is a scaling factor of the temporal correlation, and k ∈ {1/2, 1}: k = 1/2 gives the Gauss-Markov field model, k = 1 the squared-distance model.
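As a quick numeric illustration, the correlation model above can be evaluated directly. All parameter values here are illustrative assumptions (θ is the name used here for the temporal scaling factor):

```python
import math

def correlation(x1, t1, x2, t2, c=0.001, theta=1.0, k=1.0, sigma2=1.0):
    """E[Y(x1,t1)Y(x2,t2)] = sigma^2 * exp(-c * ((x1-x2)^2 + theta*(t1-t2)^2)^k)."""
    return sigma2 * math.exp(-c * ((x1 - x2) ** 2 + theta * (t1 - t2) ** 2) ** k)

# k = 1: squared-distance model; k = 1/2: Gauss-Markov field model
for dx in (0, 5, 50):
    print(dx, correlation(0, 0, dx, 0, k=1.0), correlation(0, 0, dx, 0, k=0.5))
# correlation decays monotonically with separation; a larger c means faster decay
```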
Information Fidelity
- Fidelity Measure: distortion. Original field X, reconstructed field X̂.
- Distortion d(X, X̂): the difference between the two field values; example: mean-square error E[(X − X̂)²].
- Distortion Sources [1, 2, 3]:
  - finite number of sensors: interpolation error, spatial distortion;
  - conversion of continuous-magnitude samples to a discrete format;
  - lossy coding of the discretized samples;
  - delay: time distortion in real-time applications.
Information Fidelity (Cont’)
Total distortion [1]:

  D = (1/|G|) ∫_G d[S(x, y), Ŝ(x, y)] dx dy

where S(x, y) (resp. Ŝ(x, y)) is the value at position (x, y) in the original (resp. reconstructed) field. When sensors are dense (i.e., satisfying the Nyquist sampling theorem), the distortion due to interpolation error is negligible, and the total distortion is approximately determined by the data processing at each sensor (if time distortion is not considered):

  D ≈ (1/N) Σ_{n=1}^N d(S_n, Ŝ_n)
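A minimal Monte Carlo sketch of the dense-sensor approximation with squared-error distortion; the synthetic 1-D field and the quantizer step size are assumptions chosen for illustration:

```python
import math
import random

def field(x):
    # a smooth synthetic 1-D "temperature" field (illustrative only)
    return 20.0 + 5.0 * math.sin(0.1 * x)

def quantize(v, step=0.5):
    # uniform quantizer, reconstructing each value at its cell midpoint
    return (math.floor(v / step) + 0.5) * step

random.seed(0)
N = 10_000                          # a dense deployment over G = [0, 100]
xs = [random.uniform(0.0, 100.0) for _ in range(N)]

# D ~ (1/N) * sum_n d(S_n, S_hat_n), here with d = squared error
D = sum((field(x) - quantize(field(x))) ** 2 for x in xs) / N
print(D)    # on the order of step^2 / 12 for a fine uniform quantizer
```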
Compression via Coding
- Question: what is the minimal data rate for a given distortion constraint D?
- Rate-Distortion Function [4]:

  R(D) = min_{p(Ŝ|S): E[d(S, Ŝ)] ≤ D} I(S; Ŝ)

  where the minimization is over all conditional distributions p(Ŝ|S) for which the joint distribution p(S, Ŝ) satisfies the expected distortion constraint, and the average mutual information between S and Ŝ is

  I(S; Ŝ) = ∫ p(S, Ŝ) log [ p(S, Ŝ) / (p(S) p(Ŝ)) ] dS dŜ
Compression via Coding (Cont’)
- Quantization + Entropy Coding: the Rate-Distortion Theorem provides a lower bound on the achievable information rate (an "ideal encoder"); a more practical way is

  X(t) → [Sampler] → X(k) → [Quantizer] → I_k → [Entropy coder]   (samp/sec)

  where I_k is the index of the quantization cell in which X(k) lies.
- Entropy coding is lossless on the quantized data; the size of the quantization cell should satisfy the distortion requirement.
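The sampler/quantizer/entropy-coder chain can be sketched as follows. The AR(1) source, the step size, and the use of the empirical index entropy H(I_k) as a stand-in for the rate an ideal entropy coder approaches are all illustrative assumptions:

```python
import math
import random
from collections import Counter

def ar1(n, rho=0.9):
    """Sample a correlated AR(1) process X(k), a stand-in for sampled sensor data."""
    x, out = 0.0, []
    for _ in range(n):
        x = rho * x + random.gauss(0.0, 1.0)
        out.append(x)
    return out

random.seed(1)
step = 0.5
samples = ar1(100_000)                               # X(k): sampler output
indices = [math.floor(x / step) for x in samples]    # I_k: quantization cell index

# empirical entropy of the index stream, in bits per sample
counts = Counter(indices)
n = len(indices)
H = -sum((c / n) * math.log2(c / n) for c in counts.values())
print(f"~{H:.2f} bits/sample")
# a coarser step (larger cells) lowers the rate but raises the distortion
```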
Lossless Coding
Three types of lossless coding schemes for data aggregation in wireless sensor networks [1]:
- independent encoding and decoding;
- conditional encoding and decoding;
- distributed source coding (DSC) - Slepian-Wolf [4].
Lossless Coding (Cont’)
Independent encoding and decoding
- Each sensor encodes its quantization index independently.
- Simplest, but no compression gain (blind to the correlation).
- Number of generated bits from all sensors to be sent to the sink:

  T^(IC) = H(Ŝ_1) + H(Ŝ_2) + ... + H(Ŝ_N)

  which is the same for any routing structure.
Lossless Coding (Cont’)
Conditional encoding and decoding
- Each sensor encodes its local quantization index conditioned on the side information (indices) received from its descendants in the routing tree.
- Compression gain depends on the routing structure.
- Explicit communication among nodes.
- Partially takes advantage of the correlation structure.
- Number of generated bits from all sensors to be sent to the sink:

  T^(EC) = Σ_{n=1}^N H(Ŝ_n | the set of Ŝ_i known at node n)
Lossless Coding (Cont’)
Distributed Source Coding - Slepian-Wolf [4, 5]
- Theorem [4]: let (X_{1i}, X_{2i}, ..., X_{mi}) be jointly ergodic sources with distribution p(x_1, x_2, ..., x_m). Then the set of rate vectors achievable for distributed source coding with separate encoders and a common decoder is defined by

  R(U) ≥ H(X(U) | X(U^c))   for all U ⊆ {1, 2, ..., m}

  where R(U) = Σ_{i∈U} R_i and X(U) = {X_j : j ∈ U}.
- Example, 2-source case:

  R(X_1) ≥ H(X_1 | X_2),   R(X_2) ≥ H(X_2 | X_1),
  R(X_1, X_2) ≥ H(X_1, X_2)
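The two-source region can be checked numerically; the joint pmf below is a hypothetical example of two correlated binary sources, not taken from the slides:

```python
import math
from collections import defaultdict

def entropy(probs):
    """Entropy in bits of an iterable of probabilities."""
    return -sum(q * math.log2(q) for q in probs if q > 0)

# hypothetical joint pmf of correlated binary sources X1, X2
p = {(0, 0): 0.45, (0, 1): 0.05, (1, 0): 0.05, (1, 1): 0.45}

p1, p2 = defaultdict(float), defaultdict(float)
for (x1, x2), q in p.items():
    p1[x1] += q
    p2[x2] += q

H12 = entropy(p.values())              # H(X1, X2)
H1g2 = H12 - entropy(p2.values())      # H(X1 | X2) = H(X1, X2) - H(X2)
H2g1 = H12 - entropy(p1.values())      # H(X2 | X1)

def achievable(R1, R2):
    """Slepian-Wolf achievability check for two sources."""
    return R1 >= H1g2 and R2 >= H2g1 and R1 + R2 >= H12

print(H1g2, H2g1, H12)
print(achievable(0.5, 1.0))    # inside the region
print(achievable(0.2, 0.2))    # violates the sum-rate bound
```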
Lossless Coding (Cont’)
Distributed Source Coding - Slepian-Wolf [4, 5]
- Each sensor encodes its index without knowing the other sensors' indices, but with the assumption that the decoder will know those indices at the time of decoding.
- Full knowledge of the correlation structure is required.
- Number of generated bits from all sensors to be sent to the sink:

  T^(SW) = H(Ŝ_1) + H(Ŝ_2 | Ŝ_1) + ... + H(Ŝ_N | Ŝ_1, ..., Ŝ_{N−1}) = H(Ŝ_1, ..., Ŝ_N)

- No redundancy in the generated traffic! T^(SW) is independent of the choice of the routing structure, and T^(SW) ≤ T^(EC) ≤ T^(IC).
- Drawback: high complexity in encoding; distributed implementation?
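To make the ordering T^(SW) ≤ T^(EC) ≤ T^(IC) concrete, here is a toy calculation; the binary quantization indices forming a symmetric Markov chain with flip probability eps are an assumed stand-in for spatially correlated readings along a line:

```python
import math

def h(p):
    """Binary entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

N, eps = 10, 0.1
hc = h(eps)                       # H(S_i | S_{i-1}) for the Markov chain

T_IC = N * 1.0                    # independent coding: H(S_i) = 1 bit each
T_SW = 1.0 + (N - 1) * hc         # Slepian-Wolf: chain rule, any routing structure

# conditional (explicit-communication) coding depends on the routing tree:
T_EC_chain = 1.0 + (N - 1) * hc   # line routing: each node hears its descendant
T_EC_star = N * 1.0               # star routing: no side information at all

print(T_SW, T_EC_chain, T_EC_star, T_IC)
```

On the chain, conditional coding matches Slepian-Wolf because the Markov side information is exactly the descendant's index; on the star it degenerates to independent coding, illustrating how T^(EC) depends on the routing structure.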
Aggregation with Compression
Based on different coding schemes, there are two types of aggregation (routing) strategies:
- Aggregation with Distributed Source Coding (Slepian-Wolf) [5];
- Aggregation with Explicit Communication (EC) [6, 7]:
  - Routing Driven Compression (RDC);
  - Compression Driven Routing (CDR).
Aggregation with DSC
If sensor nodes have perfect knowledge about their correlation, they can encode/compress the data so as to avoid transmitting redundant information; each source can then send its encoded data to the sink along the shortest path, without any need for intermediate aggregation.
Aggregation with EC
Routing Driven Compression (RDC) [6]: the sensors do not have any knowledge about their correlation and send data along the shortest paths to the sink, while allowing for OPPORTUNISTIC aggregation wherever the paths overlap. Compression is not the objective, only an opportunity.
Aggregation with EC
Compression Driven Routing (CDR) [6]: the sensors have no knowledge of the correlations, but the data is aggregated close to the sources and initially routed so as to allow for maximum possible aggregation at each hop. Eventually, the collected data is sent to the sink along the shortest possible path. Compression is the objective, but the CDR of [6] is only an extreme case: there is no optimality in the transmission structure!
An Example
- Objective: minimize the total transmission cost of transporting the information collected by the sources to the sink.
- Joint optimization: the transmission (routing) structure from the sources to the sink, and the (information) rate allocation at each source.
- Strategies: DSC (Slepian-Wolf); Explicit Communication (EC).
- Assumptions: distortion is handled by quantization; snapshot aggregation (no temporal correlation is considered).
Problem Formulation [5]
- Graph G = (V, E), |V| = N + 1; sources V_S = {1, 2, ..., N}; sink {N + 1}; edge e = (i, j) with weight w_e.
- Jointly optimize the rate allocation and the transmission structure:

  {R_i*, d_i*}_{i=1}^N = argmin_{ {R_i, d_i}_{i=1}^N } Σ_{i=1}^N F(R_i) d_i

  where R_i is the net traffic (i.e., rate) generated at source i and d_i = Σ_{e ∈ E_i} w_e is the total weight of the path E_i from source i to the sink.
Strategy 1 - DSC
Corollary 1 - Optimality of the shortest path tree (SPT) for the single-sink data aggregation problem [5]: When there is a single sink in the data aggregation problem and Slepian-Wolf coding is used, the SPT is optimal, in terms of minimizing the total flow cost, for any given rate allocation.
The joint optimization problem thus separates:
- first, optimize the transmission structure by building the SPT;
- second, optimize the rate allocation for the given SPT.
Strategy 1 - DSC (Cont’)
When the SPT is found, assume the path weights from the sources to the sink are ordered as

  d_SPT(X_1) ≤ d_SPT(X_2) ≤ ... ≤ d_SPT(X_N)

The rate allocation problem is

  min_{ {R_i} } Σ_{i=1}^N F(R_i) d_SPT(i)
  s.t. Σ_{i∈Y} R_i ≥ H(Y | Y^c)   for all Y ⊆ V_S

For a linear cost function F(R), the optimal rate allocation is

  R_1* = H(X_1),
  R_2* = H(X_2 | X_1),
  ...
  R_N* = H(X_N | X_{N−1}, ..., X_1),

i.e., most of the load goes to nodes close to the sink, and small rates to nodes at the extremity of the network.
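A toy numeric check of the "large rates near the sink" claim, under assumed simplifications: a linear cost F(R) = R, made-up ordered path weights, and a binary Markov correlation model whose conditional entropies are all h(eps):

```python
import math

def h(p):
    """Binary entropy in bits."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

N, eps = 5, 0.1
d = [1, 2, 3, 4, 5]          # hypothetical SPT path weights, sorted ascending

# Slepian-Wolf rates along the ordering: the closest node sends full entropy,
# every farther node only a conditional entropy
R = [1.0] + [h(eps)] * (N - 1)

cost_opt = sum(Ri * di for Ri, di in zip(R, d))            # big rate, small weight
cost_rev = sum(Ri * di for Ri, di in zip(reversed(R), d))  # big rate, big weight
print(cost_opt, cost_rev)
```

Swapping the allocation so the full-entropy rate sits at the farthest node strictly increases the total flow cost.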
Strategy 1 - DSC (Cont’)
Difficulties for distributed implementation
- The order of the path weights is required as a-priori knowledge to allocate rates (i.e., global knowledge of the SPT).
- Global knowledge of the correlation structure is required for each node to calculate its conditional entropy.

Approximated Slepian-Wolf coding [7]
- Assumption: correlation decays fast with distance.
- Algorithm (locally order the path weights):
  - find the SPT;
  - for each node i, find in its neighborhood N(i) the set C_i of nodes that are closer to the sink, on the SPT, than node i;
  - set the coding rate R_i = H(X_i | C_i) for its quantization index.
Strategy 1 - DSC (Cont’)
Performance loss of the approximated Slepian-Wolf coding [7]:
- Gaussian random field (squared-distance model);
- area size 100×100; 50 nodes, uniformly distributed;
- intensity of correlation c = 0.001 (high) to 0.01 (low).
Strategy 2 - EC
- Conditional encoding is used.
- Rate allocation and transmission structure selection are NOT separable; the SPT is not necessarily the optimal transmission structure.
- Example (all link weights = 1): Shortest Path Tree (SPT, left) vs. Traveling Salesman Path (TSP, right); if r < 0.5R, the TSP is better than the SPT.
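The r < 0.5R crossover can be reproduced with the leaf/non-terminal cost model on a small hypothetical network (edge weights chosen here for illustration, not taken from the slide's figure): sources A and B with w(A, sink) = w(B, sink) = 2 and w(A, B) = 1.

```python
def tree_cost(leaf_depths, interior_depths, R, r):
    """Each node's own bits travel its whole path to the sink:
    leaves generate rate R, non-terminal nodes rate r."""
    return R * sum(leaf_depths) + r * sum(interior_depths)

R = 1.0
spt = dict(leaf_depths=[2, 2], interior_depths=[])   # A and B both send directly
tsp = dict(leaf_depths=[3], interior_depths=[2])     # path A -> B -> sink

for r in (0.2, 0.5, 0.8):
    print(r, tree_cost(R=R, r=r, **spt), tree_cost(R=R, r=r, **tsp))
# SPT cost is 4R; TSP cost is 3R + 2r, so the TSP wins exactly when r < 0.5R
```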
Strategy 2 - EC (Cont’)
Problem formulation: find the spanning tree ST = {T, L}, with T the set of non-terminal nodes and L the set of leaves, T ∪ L = V, such that

  ST* = argmin_ST [ Σ_{l∈L} d_ST(l) + (1 − ρ) Σ_{i∈T} d_ST(i) ]

where a simple correlation model is assumed: the rate at the leaves is R and at the non-terminal nodes is r, and the correlation coefficient is ρ = 1 − r/R, 0 ≤ ρ ≤ 1.

NP-complete for general ρ [7].
Strategy 2 - EC (Cont’)
Approximation Algorithms [7]
- SPT: reference scheme.
- Greedy algorithm: start from an initial subtree containing only the sink; then successively add the node that causes the least cost increment to the existing subtree.
- Leaves deletion algorithm: start from the SPT; check for possible cost improvements by making leaf nodes change their parent to some other leaf node in their neighborhood.
- Balanced SPT/TSP: build the SPT up to a radius q away from the root, and then build TSPs starting from the leaves of the SPT in their respective sub-regions.
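A simplified sketch of the greedy step: if the cost increment of attaching a node is approximated by just the cheapest edge into the current subtree (ignoring the rate weighting used in [7]), the construction reduces to a Prim-style greedy:

```python
def greedy_tree(nodes, sink, w):
    """Grow a tree from the sink, always attaching the cheapest outside node.
    w maps frozenset({u, v}) -> edge weight; returns a child -> parent dict."""
    in_tree, parent = {sink}, {}
    while len(in_tree) < len(nodes):
        _, u, v = min((w[frozenset((u, v))], u, v)
                      for u in nodes if u not in in_tree
                      for v in in_tree if frozenset((u, v)) in w)
        parent[u] = v
        in_tree.add(u)
    return parent

# hypothetical 4-node example, sink labeled 0
edges = {(0, 1): 1, (0, 2): 4, (1, 2): 1, (2, 3): 1, (0, 3): 5}
w = {frozenset(e): c for e, c in edges.items()}
print(greedy_tree({0, 1, 2, 3}, 0, w))   # chains 2 behind 1 and 3 behind 2
```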
Strategy 2 - EC (Cont’)
Performance of Approximation Algorithms [7]
Comparison: DSC vs. EC
One-dimensional case:

                                        DSC        Explicit Comm.
  Coding complexity                     High       Low
  Optimal route design                  Simple     Hard
  Full knowledge of the
  correlation structure                 Required   Not required
  Generated bits                        DSC ≤ EC
  Total flow cost                       ?
Comparison: DSC vs. EC (Cont’)
Cost ratio [5]:

  φ(N) = cost_SW(N) / cost_EC(N)

- If the entropy rate is > 0:  lim_{N→∞} φ(N) = 1.
- If the entropy rate is 0 and H(X_i | X_{i−1}, ..., X_1) = Θ(1/i^p):
  - for p ∈ (0, 1):  lim_{N→∞} φ(N) = 1;
  - for p > 1:  lim_{N→∞} φ(N) = 0.
Conclusion & Future Work
- There is a fundamental tradeoff between information fidelity and compression gain.
- For a given distortion, different coding schemes combined with routing strategies achieve different gains.
- The joint optimization of routing and rate allocation (coding) is generally difficult.
- DSC (Slepian-Wolf) elegantly separates the joint optimization problem, but its distributed implementation is still under investigation [8, 9].
- Aggregation with EC is practical, but good approximation algorithms are needed to find the optimal transmission structure for general correlated sources.
Conclusion & Future Work
- The multiple-sink case is more complicated: [5] shows that finding the optimal transmission structure is NP-complete even with Slepian-Wolf coding; the multiple-rate problem at a single node further increases coding complexity.
- Lossy compression.
- More interesting optimization problems (e.g., sensor density, placement, etc.).
References
[1] D. Neuhoff, “Field-Gathering Sensor Networks, Distributed Encoding and Oversampling”, Canadian Workshop on Information Theory, May 2003.
[2] R. Cristescu and M. Vetterli, “On the Optimal Density for Real-time Data Gathering of Spatio-Temporal Processes in Sensor Networks”, in the Proc. of ACM IPSN’05, 2005.
[3] A. Scaglione and S. Servetto, “On the Interdependence of Routing and Data Compression in Multi-hop Sensor Networks”, to appear in ACM/Kluwer Journal on Mobile Networks and Applications (MONET); see also the short version in MobiCom 2002.
[4] T. Cover and J. Thomas, Elements of Information Theory, John Wiley and Sons, Inc., 1991.
[5] R. Cristescu, B. Beferull-Lozano and M. Vetterli, “Networked Slepian-Wolf: Theory, Algorithms and Scaling Laws”, to appear in IEEE Trans. on Information Theory, 2005.
[6] S. Pattem, B. Krishnamachari and R. Govindan, “The Impact of Spatial Correlation on Routing with Compression in Wireless Sensor Networks”, in the Proc. of ACM IPSN’04, Apr 2004.
[7] R. Cristescu, B. Beferull-Lozano and M. Vetterli, “On Network Correlated Data Gathering”, in the Proc. of IEEE Infocom’04, 2004.
[8] S. Pradhan, J. Kusuma and K. Ramchandran, “Distributed Compression in a Dense Microsensor Network”, IEEE Signal Processing Magazine, pp. 51-60, Mar 2002.
[9] A. Aaron and B. Girod, “Compression with Side Information Using Turbo Codes”, in the Proc. of IEEE Data Compression Conference (DCC’02), Snowbird, UT, pp. 252-261, Apr 2002.