network time synchronization using decentralized kalman...

Network Time Synchronization Using Decentralized Kalman Filtering

Maxime Cohen

Network Time Synchronization Using Decentralized Kalman Filtering

Research Thesis

Submitted in Partial Fulfillment of the Requirements for the Degree of Master of Science

Maxime Cohen

Submitted to the Senate of the Technion - Israel Institute Technology

Cheshvan 5769 Haifa October 2009

This research thesis was done under the supervision of Prof. Nahum Shimkin in the department of Electrical Engineering.

Acknowledgement I would like to thank Prof. Nahum Shimkin for his guidance, for his advice and support, and for the great working atmosphere during all the stages of the research. I would also like to thank the reviewers for their precious and helpful insights on this thesis. Special gratitude to my parents for their love and for the education they gave me, with its enormous emphasis on academic excellence.

The generous financial help of the Technion is gratefully acknowledged

a

Contents Abstract 1

1. Introduction 5

2. Scientific Background 7 2.1 Least-Squares Fit...……………………………………………………………......7 2.2 Kalman Filtering………………………………………………………………....10 2.3 Kalman Filter - Information Form…………..…………………………………...12 2.4 Facts from Graph Theory……..……………..…………………………………...12 2.5 Matrix Analysis…………………………………………………………………..14

3. Related Work 15 3.1 Time Synchronization Protocols……………………………………………….…15 3.2 Decentralized Estimation………………………………………………………....18 3.3 Consensus Algorithms…………………………………………………………....20

4. Model and Problem Definitions 21

4.1 Clock Model……………..………………………………………………..….…..21 4.2 The Measurements……………………………………………………….…..…...21 4.3 Problem Formulation………………………………………………………..…....22 4.4 State Space Model…………………………………………………………….......22

5. Centralized and Decentralized Algorithms: Single Measurement Update 24 5.1 Kalman Filter Equations………………………………….………………….…...24 5.2 The LS approach………….……………………………….……………………...25 5.3 Decentralized Algorithm: Optimality Equations…………………….……….......26 5.4 Decentralized Algorithm: Iterative Equations……………………….……….......30

6. Convergence Analysis 33

7. Recursive Algorithms: Multiple Measurement Update 39 7.1 The Centralized KF Algorithm……….…………...…………………..………......39 7.2 The Optimal Decentralized Algorithm…………….………………….…..….........40

b

7.3 Equivalence with the KF Equations…….……………………………….....………48 7.4 A Sub-Optimal Decentralized Algorithm…………..………………………...…....49

7.5 Decentralized Non-Recursive Algorithm……….…………………………...……52

8. Extensions 54 8.1 Incorporating a Discount Factor…….………………………………..……….......54 8.2 Addition of Process White Noise.…..……………………………….…..….........57 8.3 Faulty Communication Environment….…………………………………….…....60

9. Clock Skew Estimation 61 9.1 Combined Skew and Offset Estimation…….………………………..……...........62 9.2 Separate Skew and Offset Estimation………...…………………………..............65

10. Numerical Results 73 10.1 Algorithms Description…………………………………………….………........73 10.2 Network Topologies………..…………..…………………………..…...….........74 10.3 Graphical Results…………………….…………………………………….….....76

11. Conclusion and Future Work 91

References 93

Appendix A - Equivalence between KF and LS 97

Appendix B - Maximum-Likelihood and Maximum A-Posteriori Estimators 99 Appendix C - CTP Numerical Results 101

Hebrew Abstract I

c

List of Figures

2.1 Fitting a set of data points using a quadratic function……………………….…….7

2.2 Typical Kalman Filter application, from Maybeck [27]……………………….....10

2.3 Example of a 5-node network………………………………………………….…13

4.1 Communication between two neighboring nodes………………………….….….21 9.1. Communication between two neighboring nodes for the skew estimation problem……………………………………………….....65 10.1 A general network with internal loops ………………………………………...…75 10.2 Comparison between the decentralized CTP and WLS algorithms in Network 2……………….……………………………………………….....…77 10.3 Comparison between the decentralized CTP and WLS algorithms in Network 2, with R I= ……………………………………………………..…78 10.4 Comparison between the decentralized CTP and WLS algorithms in Network 2, with different ijr ……….……………………………………….…79

10.5 Comparison between the decentralized CTP and WLS algorithms (with R and with 2R ) in Network 2…………………………………………..………...…80

10.6 Comparison between the decentralized CTP and WLS algorithms (with additive Gaussian noises in R ) in Network 2..…………..………….……….......81

10.7 Comparison between the decentralized CTP and DKF algorithms (with ( )1

0P diag− → ∞ and R I= ) in Network 2……………...…….……...……….…82

10.8 Comparison between the decentralized CTP and DKF algorithms (with different ( )0 ii

P and R I= ) in Network 2….………………………….….…..…..83 10.9 Comparison between the decentralized CTP and DKF algorithms (with 10% of nodes synchronized via GPS) in Network 2………………..….….……..84 10.10 Comparison between CKF and SOA (with 0P R I= = ) in Network 3 for different values of T ………………………….…………………......…….....86

d

10.11 Variance comparison between CKF and SOA (with 0P R I= = ) in Network 3…………………...…………………………….………..……...…..87

10.12 Comparison between CKF, SOA and CLS (with [ ]0.01,12R U∼ and 0P I≠ ) in Network 3 for different values of T ..……………….……….…..89

10.13 Variance comparison between CKF and SOA (with [ ]0.01,12R U∼ and 0P I≠ ) in Network 3………………………….……………...………….......90

C.1 Distribution of the clock offsets for each algorithm in Network 1…………........101 C.2 Probability Density Function of the clock offsets for each algorithm in Network 1.………………………………………………………………...…..102 C.3 The clock offsets dispersion in Network 1……………………..………….….…103 C.4 Rate convergence analysis of the decentralized CTP algorithm in Network 2.…………………………………………………………...…...…...104

e

List of Tables

5.1. Summary of the Decentralized Synchronous Kalman Filter Algorithm…………………………………………………….........32 7.1. Summary of the Decentralized Synchronous Recursive KF Algorithm………………..………………………………………………..…..47 7.2. Summary of the Decentralized Sub-Optimal Algorithm………………….….......51

1

Abstract Accurate clock synchronization is important in many distributed applications, both in wire line and wireless computer networks. Time synchronization between the nodes of a network was extensively treated in the literature, where several methods and algorithms were proposed to solve this problem efficiently. In the Internet for example, the “Network Time Protocol” (NTP) is the most widely accepted standard for clock synchronization. In some recent work, improved algorithms that rely on Least-Squares estimation were introduced. The accuracy of clock synchronization was improved by imposing the global constraints for all the loops in the multihop network and the use of a distributed algorithm employing only local broadcasts. A central characteristic of these methods is their decentralized structure that requires only local communication with neighbors. In this research, we will extend the Least-Squares framework by developing algorithms that estimate the offset of the local clock at each network node, using a Kalman Filter framework. We will present a synchronous decentralized implementation of the filtering algorithm that employs only local broadcasts and we will prove that it converges to the optimal centralized solution. The Kalman Filter framework allows exploiting some a-priori knowledge and providing different weights to the measurements according to their accuracy. The next step is to consider the multiple measurement case and to present a recursive version of these algorithms. The recursive algorithm computes the optimal offsets and the corresponding variances after receiving each set of measurements in a decentralized manner. Finally, we will extend the results to the estimation of the clock skew (i.e., rate deviation) in addition to its offset. Then, we will consider different extensions of the basic algorithm. We will incorporate a discount factor in the objective function and treat the case where temporary communication failures are considered. We also present simulation results over several network topologies for evaluating and comparing the accuracy of the proposed time synchronization schemes. We will provide several interesting comparisons and as expected, the Kalman Filter approach outperforms the existing algorithms.

2

Summary of Notations

A Reduced incidence matrix iα Skew (rate deviation) of node iΛ

ˆ jiα Estimated skew ratio between nodes iΛ and jΛ b Bias (constant random noise) B Bias covariance matrix [ ],Cov ⋅ ⋅ Covariance function γ Forgetting factor

id Deviation from the i-th data point

klδ Kronecker's delta Π Objective function in the best fitting curve problem

ije Bidirectional link between nodes iΛ and jΛ

[ ]E ⋅ Mathematical expectation

ijε Additive noise in the measurement model between nodes iΛ and jΛ J Objective function k Iteration number mk Probe packet number TL Matrix transpose

m Number of edges in the network M Iteration matrix n Discrete time index Ν Number of nodes in the network

iΝ Number of elements in the set iN

( ),N μ Σ Gaussian density with mean μ and covariance matrix Σ

x∇ Gradient operator with respect to x ˆ

ijO Measurement between the pair iΛ and jΛ

{}P ⋅ Probability

{ }|P ⋅ ⋅ Conditional probability

( | )P n n Error covariance matrix at time n given observations up to and including time n

( )1|P n n+ Error covariance matrix at time 1n + given observations up to and including time n

( )Mρ Spectral radius of M

0P Initial known covariance matrix

( )10 *i

P − Row number i of the matrix 10P −

jir Variance of the measurement ˆijO

1R− Inverse covariance matrix of the measurement noise iΛ Node number

3

t Real time (reference time) bT Number of measurement sets in the skew estimation problem ( )iT t Local time (at node iΛ )

ST Sampling interval

mt Transmission time of packet mk

mt Received time of packet mk

iτ Offset of node iΛ *τ Optimal centralized solution

( )u n External input at time n in the state space model ( 1)v n + Measurement noise at time 1n + in the state space model

( )w n Process noise at time n in the state space model

iiW Weight number i in the WLS problem

0x Initial known state vector ( )x n Sate vector at time n

ˆ( | )x n n State estimate at time n given observations up to and including time n ˆ( 1| )x n n+ State estimate at time 1n + given observations up to and including time n

( )ij mx k Propagation delay of packet mk between nodes iΛ and jΛ ( )X n Augmented state space vector at time n y Measurement vector

( 1)z n + Measurement vector at time 1n + in the state space model

4

Summary of Abbreviations

BLUE Best Linear Unbiased Estimator CTP Classless time Protocol CKF Centralized Kalman Filter CLS Centralized Least-Squares DKF Decentralized Kalman Filter FTSP Flooding Time Synchronization Protocol GM Gauss–Markov GPS Global Positioning System IID Independent Identically Distributed KF Kalman Filter

LQG Linear-Quadratic-Gaussian LQR Linear-Quadratic Regulator LS Least-Squares

MAP Maximum-A-Posteriori ML Maximum-Likelihood

MMSE Minimum Mean Squared-Error NTP Network Time Protocol OLS Ordinary Least-Squares PDF Probability Density Function PSD Positive Semi-Definite RBS Reference-Broadcast Synchronization RLS Recursive Least-Squares

SNTP Simple Network Time Protocol SOA Sub-Optimal Algorithm

s.t Such That UTC Coordinated Universal Time WSN Wireless Sensor Network

5

1. Introduction Accurate clock synchronization is required in many distributed applications in computer networks (e.g., sleep scheduling in the case of low duty cycle [24], and tracking in wireless sensor networks [33]). Moreover, network time synchronization is a critical component for commercial organizations that rely on several computers, all of which have clocks that are the source of time for the files or operations they handle. When clocks of the different components on such systems are not synchronized, data can be lost, processes can fail, the exposure increases and security is compromised. The task of synchronizing clocks in distributed systems is usually accomplished via the exchange of standard messages (probe packets) between the distributed entities in order to coordinate their time. We will assume for simplicity that the links are bi-directional, the network topology is time-invariant and that each node is capable of sending and receiving messages from its neighbors. There is a large literature on how to synchronize clocks in traditional networked systems; among these, the “Network Time Protocol” (NTP) is the most widely accepted standard for synchronizing clocks over the Internet [28-30]. More recently, a novel approach for time synchronization termed CTP – Classless Time Protocol [14] was proposed. This non-hierarchical approach exploits convex optimization theory in order to evaluate the impact of each clock offset on the overall objective function. It was shown that CTP substantially outperforms hierarchical schemes such as NTP in the sense of clock accuracy with respect to a universal clock, without increasing complexity. An alternative proposed approach is the well known Least-Squares Estimator in [41, 12]. The accuracy of clock synchronization was improved by exploiting global network-wide constraints (e.g., the relative offsets are summing up to zero over loops) and the use of a completely asynchronous, distributed algorithm employing only local broadcasts. The central characteristic of these methods relies in their decentralized structure that requires only local communication with neighbors. In estimation theory, for a linear dynamic system under the Gaussian assumption the Kalman Filter (KF) is the optimal MMSE (Minimum Mean Squared-Error) state estimator. If the Gaussian assumption is relaxed, we will obtain the linear optimal MMSE state estimator. The implementation of the KF in a decentralized manner was extensively treated in the literature, as we will see in Section 3. Our objective is to develop efficient decentralized estimation algorithms in order to synchronize the different clocks over the network with respect to the reference time. Without loss of generality, we can assume that Node 1 is synchronized with the universal clock, and we thus have to synchronize the other clocks with respect to it. Firstly, we will consider the case where all the clocks run exactly at the same rate (i.e., there is no clock skew). In this case, our objective reduces to estimate the clock offsets at each network node relative to the clock reference. We will extend the Least-Squares framework by developing algorithms that estimate the offset of the local clock at each network node, using a Kalman Filter framework. The first step is to formulate the model in the state space form where the state is the vector of biases of the clocks in the network. Then, we will show that a single measurement vector update can be done using a distributed iterative scheme that converges to the optimal centralized estimator. The Kalman Filter framework allows exploiting a-priori knowledge about the estimated quantity and providing different weights to the measurements according to their accuracy and quality. We will make the natural assumption that the initial state covariance

6

matrix is diagonal, however we will observe that after the first measurement update of the KF, the state covariance matrix does not remain diagonal. Hence, from this step the standard KF equations cannot be decentralized and each node has to communicate with every other node in the network. This is not a desirable situation since it is prohibitively expensive in terms of communication time. We will solve this issue by proposing a decentralized recursive algorithm that relies on manipulating the standard equations. We rely on the theorem that claims the equivalence between the KF solution and the minimizing vector of a deterministic constrained LS problem. In this way, we will be able to obtain the existing LS solution as a special case. We find the optimal solution by a coordinate differentiation and then we will implement the optimal equation by a synchronous iterative algorithm that employs only local broadcasts. Then, we will prove that it converges to the optimal centralized solution. The next step is to consider the multiple measurement case and to present a recursive version of these algorithms. The recursive algorithm computes the optimal offsets and the corresponding variances in a decentralized manner after receiving each set of measurements. We also consider a simple sub-optimal algorithm that neglects the off-diagonal terms of the inverse covariance matrix. This method reduces significantly the complexity, but looses its optimal property. We will see in the simulation results section that this algorithm leads to poor results. In the extensions section we will incorporate a discount factor in the objective quadratic function to compensate for the time-invariant offsets assumption. Then, we modify our algorithm slightly to make it robust to temporary communication failures. We briefly consider the extension of our results to the estimation of both the offsets and the clock skew (i.e., rate deviation). We will show that the clock skew estimation problem reduces to the same mathematical setup as the offset estimation problem under the appropriate substitutions. For the clock skew estimation problem, we propose different approaches. In the first, the clock skew estimation is performed separately from that of the clock offset. In the second, we propose an optimal combined estimation of both the clock offset and the clock skew. Finally, we present simulation results over several network topologies for evaluating and comparing the accuracy of the proposed time synchronization schemes. We provide several interesting comparisons, where the Kalman Filter approach outperforms the existing algorithms. It is interesting to note that the time synchronization problem is mathematically equivalent to any related distributed estimation problem stemming from relative additive measurements in sensor networks [3]. For example, one can apply the same algorithms to the sensor localization problem. We will briefly elaborate on this point in Section 7. This thesis is organized as follows. In sections 2 and 3, we review the required scientific background and the related work respectively. In Section 4, we describe the model and formulate the problem. Then, in Section 5, we present the different algorithms for the case of single measurement update (both centralized and decentralized versions). Section 6 is devoted to show the convergence of the most general decentralized algorithm to the optimal centralized solution. In Section 7, we provide the recursive version of our algorithm for multiple measurements and an additional non-recursive algorithm. Sections 8 and 9 treat several extensions, like the incorporation of a discount factor and the estimation of the clock skew. Numerical results are presented in Section 10. Finally, the conclusions and some notes on future directions are reported in Section 11.

7

2. Scientific Background 2.1 Least-Squares Fit Let us first consider the Least-Squares (LS) method in a deterministic context and then explain its statistic interpretation. The method of Least-Squares or Ordinary Least-Squares (OLS) is used to solve over-determined systems and can be interpreted as a method of fitting data. This algorithm is often applied in statistical contexts, particularly in regression analysis. The best Least-Squares fit is that instance of the model for which the sum of squared residuals has its lowest value, a residual being the difference between an observed value and the value given by the model. In other words, the method of Least-Squares assumes that the best-fit curve of a given type is the curve that has the minimal sum of deviations squared (least square error) for a given set of data. For example, we can fit the data to a polynomial function, as presented in the following figure:

Figure 2.1. Fitting a set of data points using a quadratic function.

Suppose that the data points are:

( ) ( ) ( )1 1 2 2, , , ,..., ,n nx y x y x y

Here iy are the measured values (data) and ix are the independent variables (unknown). The fitting curve ( )f x has the deviation id from each data point:

( ) 1,2,..i i id y f x i n= − = .

According to the LS method, the best fitting curve has the property that the following expression is minimal:

( ) ( )2 22

1 1( ) min

n n

i i ii i

d d y f x= =

Π = = = − →∑ ∑ (2.1.1) The above minimum in (2.1.1) can be found by setting the gradient to zero. Since the model contains n parameters, we will obtain n gradient equations.

( )f x

x

8

As a special case, the linear LS problem with the following over-determined system ( M linear equations in N unknown variables, with M>N) is considered:

11, 2,..,

N

ij j ij

a x y i M=

= =∑

In a matrix notation:

Ax y= In order to find the optimal LS solution, we have to minimize the following quadratic objective function:

2 miny AxΠ = − → A unique optimal solution is obtained (when 0TA A > ) by solving the normal equations:

( )T TA A x A y= The above equation can be obtained by differentiating the objective function with respect to the vector x and setting the result to zero. Now, we will consider several approaches to iteratively solve the normal equations. Let us define:

T

T

M I A Ay A y

⎧ = −⎪⎨

=⎪⎩

We note that the matrices M and I M− are the projection matrices. Using these notations in the normal equations, we will obtain: This implies: We can implement the above equation through the use of an iterative (synchronous or asynchronous) algorithm and the convergence depends on the structural properties of the matrix M . For example, the synchronous algorithm (all the entries are updated simultaneously) is given by: Here, 0k ≥ is the iteration number. The initial conditions can be randomly chosen and does not affect the convergence of the algorithm in (2.1.2). The above method is known as the Jacobi algorithm and is very common in linear algebra. In this thesis, we will employ this method in a synchronous way to solve the linear Least-Squares problem. We also mention the relaxed form of the Jacobi algorithm. This is an alternative approach that leads to similar equations based on the gradient algorithm. Given the similar linear equations:

Ax y=

( )I M x y− =

x Mx y= +

( 1) ( ) (2.1.2)k kx Mx y+ = +

9

The gradient descent method is given by:

( )( )

( ) ( )

( 1) ( ) ( ) ( )

( 1) ( )

( 1) ( ) ( )1

k k k kx

k k

k k k

x x x I M x y

x I I M x y

x I x Mx y

η η

η η

η η

+

+

+

⎡ ⎤= − ⋅∇ Π = − ⋅ − −⎣ ⎦= − − +⎡ ⎤⎣ ⎦

= − ⋅ + +

Here, η is the step size of the algorithm. One can note that if 1η = , the gradient algorithm reduces to the Jacobi method. Hence, the gradient algorithm is more general and can be viewed as a relaxed Jacobi method. In the case where the Jacobi algorithm does not converge, we can try to use the gradient algorithm and reduce the step size to achieve convergence. Two interesting extensions to the basic LS case are considered: the Weighted Least-Squares (WLS) and the Recursive Least-Squares (RLS). In the WLS method, each data is multiplied by a weighting factor. In other words, the objective function to be minimized is a weighted sum of the form:

( )2

1min

n

ii ii

W d=

Π = →∑ The RLS method is the recursive version of the basic LS algorithm where data arrives progressively. In this particular case, the minimization process is repeated for each set of measurements. Moreover, the most useful form is RLS with exponential data weighting (incorporation of a forgetting factor). In the latter, we consider the scenario where the most recent data is assumed to be more informative than past data and hence we exponentially discard old data. The Least-Squares method also has a statistical interpretation in estimation theory. In a linear model in which the errors have a zero expectation conditional on the independent variables, are uncorrelated and have equal variances (IID), the Best Linear Unbiased Estimator (BLUE) of any linear combination of the observations is its Least-Squares estimator. This result is known as the Gauss-Markov (GM) theorem. "Best" means that the Least-Squares parameter estimators have minimum variance. The assumption of equal variance is valid when the errors all belong to the same distribution. Moreover, in a linear model, if the errors belong to a Normal distribution, the Least-Squares estimators are also the Maximum-Likelihood estimators (as we will show in Appendix B). Aitken [1] showed that when a weighted sum of squared residuals is minimized, the solution is the BLUE if each weight is equal to the reciprocal of the variance-covariance matrix of the observations. This method is known as the Weighted-Least-Squares (WLS) method. In the linear non-deterministic case, there exists a closed form solution to the RLS algorithm that can be implemented through an iterative procedure. For more details on RLS, see [13, 40]. It can be found in Appendix A that the Kalman-Filter algorithm can be viewed as a deterministic LS optimization problem. Next, we present the basic background on Kalman Filtering.

10

2.2 Kalman Filtering The Kalman Filter (KF) is an efficient recursive filter that estimates the state of a linear dynamic system from a series of noisy measurements. It is used in a wide range of engineering applications from radar to computer vision, and is an important topic in control theory and control systems engineering. Together with the Linear-Quadratic Regulator (LQR), the Kalman Filter solves the Linear-Quadratic-Gaussian control problem (LQG) [27, 13]. As seen in Figure 2.2, the KF is fed measurements from the system of interest and produces an estimate of the system state. The system is modeled either as a set of differential equations in the continuous-time case or as a set of difference equations in the case of a discrete-time system. The system model is used to propagate the estimate of the system state forward in time until a new measurement is received. At this point, the system state attained from the measurement is compared to the estimate of the system state and combined in an optimal (MMSE) manner.

Figure 2.2. Typical Kalman Filter application, from Maybeck [27]. In order to use the Kalman Filter to estimate the internal state of a process given only a sequence of noisy observations, one must model the process in accordance with the framework of the KF, i.e., in a state space model notation. This means specifying the matrices , , HΦ Γ for each time-step n as described below. In other words, the KF model assumes the true state at time 1n + is evolved from the state at n according to:

( 1) ( ) ( ) ( ) ( )( ) ( ) ( ) ( )

x n n x n n w ny n H n x n v n

+ = Φ +Γ⎧⎨ = +⎩

The Kalman Filter is a recursive estimator. This means that only the estimated state from the previous time step and the current measurement are needed to compute the estimate of the current state. In contrast to batch estimation techniques, no history of observations and/or estimates is required. The state of the filter is represented by two variables:

11

• ˆ( | )x n n , the state estimate at time n given observations up to and including time n . • ( | )P n n , the error covariance matrix (a measure of the estimated accuracy) at time n

given observations up to and including time n . The Kalman Filter has two distinct phases: prediction and update. The prediction phase uses the state estimate from the previous step to produce an estimate of the state at the current step. In the update phase, measurement information at the current time is used to refine this prediction to arrive at a new, (hopefully) more accurate state estimate. We will next present the equations of the Kalman Filter algorithm. We consider both the standard form and the information form. Consider the following state space model:

( 1) ( ) ( ) ( ) ( )( ) ( ) ( ) ( )

x n n x n n w ny n H n x n v n

+ = Φ +Γ⎧⎨ = +⎩

(0)x is the initial state of the system with the following first and second order statistics:

[ ] [ ] ( )( ) 0(0) (0) cov (0) (0) (0) (0) (0) (0)Tx x x xE x m x E x m x m P P⎡ ⎤= = − − = =⎣ ⎦ .

{ }( )w n is the process noise modeled as a white Gaussian noise with zero mean and covariance ( ) 0Q n ≥ . { }( )v n is the measurement noise modeled as a white Gaussian noise with zero mean and covariance ( ) 0R n > . { } { }( ) , ( ) , (0)w n v n x are uncorrelated, namely:

( ) ( ) (0) ( ) (0) ( ) 0 ,T T TE v n w n E x v m E x w m m n⎡ ⎤ ⎡ ⎤ ⎡ ⎤= = = ∀⎣ ⎦ ⎣ ⎦ ⎣ ⎦ The state estimation cycle is divided into two steps ( n is the discrete time index):

• Time update (prediction):

ˆ ˆ( 1| ) ( ) ( | )( 1| ) ( ) ( | ) ( ) ( ) ( ) ( )T T

x n n n x n nP n n n P n n n n Q n n

+ = Φ⎧⎨

+ = Φ Φ +Γ Γ⎩

• Measurement update:

[ ]

1

ˆ ˆ ˆ( 1| 1) ( 1| ) ( 1) ( 1) ( 1) ( 1| )

( 1) ( 1| ) ( 1) ( 1) ( 1| ) ( 1) ( 1)

( 1| 1) ( 1) ( 1) ( 1| )

T T

x n n x n n K n y n H n x n n

K n P n k H n H n P n k H n R n

P n n I K n H n P n n

−

⎧ ⎡ ⎤+ + = + + + + − + +⎣ ⎦⎪⎪ ⎡ ⎤+ = + + + + + + +⎨ ⎣ ⎦⎪

+ + = − + + +⎪⎩

The initialization is as follows:

ˆ(0 | 0) (0) ; (0 | 0) (0)x xx m P P= = As we will see later, in our case, we have:

12

( ) ( )( ) ( )

;n n I

Q n Q R n RΓ = Φ =

= =

Next, we will review an additional form of the Kalman Filter, called the information form. 2.3 Kalman Filter - Information Form The information form of the Kalman Filter differs in the fact that the covariance prediction and update equations are different. Since the prediction covariance equation is quite complex, another option is to use the regular form of the KF with the modification in the update covariance equation only. Under the same assumptions as those stated previously, the Kalman Filter equations are given by the same equations as before except for the following update inverse covariance equation:

1 1 1( 1| 1) ( 1| ) ( 1) ( 1) ( 1)TP n n P n n H n R n H n− − −+ + = + + + + + (2.3.1) There exists an additional equation for the Kalman gain ( 1)K n + (see for example in [36]):

1( 1) ( 1| 1) ( 1) ( 1)TK n P n n H n R n−+ = + + + + (2.3.2) For more details on the Kalman Filter, one can refer to the original article of R. E. Kalman [18], or to any book on optimal filtering (e.g., [27, 13]). 2.4 Facts from Graph Theory A common method of obtaining estimates of clock offsets between directly communicating pairs of nodes is based on the exchange of time-stamped packets. Viewing the network as a graph, this corresponds to finding estimates of clock offsets across the edges of the graph. These quantities must then be processed by the network to obtain estimates of the clock offsets at each node with respect to the reference clock. Hence it is worthwhile to model the network as a directed graph ( ),G V ε= with V = Ν

nodes { }1 2, ,..., NΛ Λ Λ and mε = edges. Each edge represents the ability to transmit and receive packets between the corresponding pair of nodes. We will focus on an underlying network which consists of the entities that participate in the clock synchronization protocol. Let Ν denote this set of nodes and let Ν be its cardinality (the number of nodes). The edge connecting nodes iΛ and jΛ will be denoted by ije and the collection of all the edges by ε . We will assume throughout this thesis that all the edges are bidirectional, namely that if ije ε∈ , then jie ε∈ . Let us denote by iΝ the set of nodes which are the neighbors of

iΛ , i.e., one edge away from node iΛ , and let iΝ be the number of such neighbors. For simplicity of notation, we exclude the existence of multiple edges between the same pair of nodes and also the edges from a node to itself. We consider a model in which only one out of the Ν nodes is a "reference time node" (the generalization for several reference time nodes is straightforward). Without loss of generality, we may assume that the reference

13

time node is 1Λ . Our objective is to construct the optimal offset estimate for every node

{ }\ 1u V∈ . The dimensions of the incidence matrix A are N (nodes number) × m (edges number). In the row corresponding to node iΛ , we have an entry +1 for all edges of the form (i,*), an entry -1 for all edges of the form (*,i), and 0 otherwise. For a connected graph, the rank of the incidence matrix is 1N − , or one less than the number of nodes. Thus, deleting any row from the incidence matrix yields a full row rank matrix, which is called the reduced incidence matrix. Here, we will work with the ( )1N m− × matrix obtained by deleting the row corresponding to the reference node 1Λ . For notational convenience, we use A to henceforth denote the reduced incidence matrix. We present a simple illustrative example, similar to [41]. Consider the network in Figure 2.3. Here, for the construction of the matrix A , one can randomly choose the direction of the edges without affecting the results. In other words, the links are bidirectional but each link has a single entry in A .

Figure 2.3. Example of a 5-node network.

The corresponding incidence matrix is given by:

If node number 1 is the reference, we will delete the first line of the above matrix in order to obtain the reduced incidence matrix A . We will use this matrix to obtain the state space model of the system in Section 4.4.

1

4

2

3

5 Loop 1 Loop 2

( ) ( ) ( ) ( ) ( ) ( )1,2 2,3 3,4 1,4 2,5 3,51 1 0 0 1 0 02 1 1 0 0 1 03 0 1 1 0 0 14 0 0 1 1 0 05 0 0 0 0 1 1

A

+ +− + +

=− + +

− −− −

14

2.5 Matrix Analysis In the convergence analysis (Section 6), we will show that our decentralized algorithm can be written in the form: ( 1) ( )k kMτ τ+ = . This is a standard iteration equation. Further, a sufficient and necessary condition to obtain convergence to zero from any initial guess is one where the spectral radius (the biggest eigen-value in absolute value) of the matrix M is strictly smaller than one:

( ) 1Mρ < In Non-Negative Matrix Theory (see the chapter on Gersgorin discs in [16]), it is proven that for a non-negative square matrix A , namely:

0 , 1,2,..,ij ijA a a i j n⎡ ⎤= ≥ =⎣ ⎦ We have:

1 1( ) min max , max

n n

ij iji jj iA a aρ

= =

⎧ ⎫≤ ⎨ ⎬

⎩ ⎭∑ ∑

In particular:

1

( ) maxn

iji jA aρ

=

≤ ∑ (2.5.1)

Therefore, if all the row sums of the matrix A are smaller or equal to 1, and all the entries of A are non-negative, then ( ) 1Aρ ≤ (this is only a sufficient condition). In addition, the following sharper result can be obtained. Proposition 2.5. Given a non-negative square matrix A with the following properties:

a) All the row sums of the matrix A are smaller or equal to 1. b) At least in one row this sum is strictly smaller than 1. c) The matrix A is irreducible (i.e. we can move from any node to any other node

through a direct trajectory). Then:

( ) 1Aρ < The proof of this sufficient condition is well known (see e.g., [16], chapter 7). In our network model, the condition that the matrix M is irreducible requires the assumption that the graph that corresponds to the network is connected, namely that there exists a path between any pair of nodes in the network. Some additional important assumptions are that each node has at least one neighbor (not including the reference node) and that links are bidirectional (symmetric).

15

3. Related Work In this section, we present a review of the papers that are most closely related to our work. First, we review the different accepted time synchronization protocols and then we consider a short literature survey on decentralized estimation (essentially on decentralized Kalman filtering), including consensus algorithms. 3.1 Time Synchronization Protocols An early landmark paper in computer clock synchronization is Lamport's work [23] that elucidates the importance of virtual clocks in systems where causality is more important than absolute time. A distributed algorithm is proposed for synchronizing a system of logical clocks that can be used to totally order the events. Although this work focused on giving to the events a total order rather than qualifying the time difference between them, it has emerged as an important influence in sensor networks. There is a large literature written on the art of synchronizing clocks in traditional networked systems. As we previously mentioned, the Network Time Protocol (NTP) is the widely accepted standard for synchronizing clocks over the Internet [28-30] and is notable for being scalable, self-configuring and robust to failures, in addition to being thoroughly tested. Nevertheless, this approach is vulnerable to sending delays and asymmetries in paths, and does not take advantage of the special properties of sensornet broadcasts. NTP is a client/server protocol used for synchronizing the internal clock of computers in standard networks and suggests a complete scheme for synchronizing the clocks with respect to the Coordinated Universal Time (UTC). NTP recommends data filtering and peer selection algorithms in order to reduce the offset which is the time difference between the clock and the UTC. Since NTP is used as a comparison benchmark in our simulation results, we briefly describe the procedure and more details can be found in [28-30]. According to NTP, each node iΛ computes the round trip delay for each probe packet that traverses the edge ije based on the four timing fields recorded on the packet. Each node is sending probe packets to each one of its neighbors. Time is stamped on packet mk by the sender iΛ upon transmission ( ( )i mT k ), and by the receiver jΛ upon reception of the packet ( ( )j mR k ). Then, the node jΛ retransmits the packet back to the source ( ( )j mT k ) and the source stamps its local time when receiving back the packet ( ( )i mR k ). The computed round trip delay for packet mk is given by:

( ) ( )( ) ( ) ( ) ( ) ( )ij m j m i m i m j mRTT k R k T k R k T k= − + − . The clock offset of node iΛ relative to node jΛ 's clock is estimated as:

( ) ( )1 ( ) ( ) ( ) ( )2 j m i m i m j mR k T k R k T k⎡ ⎤− − −⎣ ⎦ .

NTP suggests the "minimum filter", which selects from the n most recent samples the sample with the lowest round trip delay. Each node estimates its relative clock offset with

16

respect to a selected group between its neighbors, where neighbors which are hop count closer to the reference node are preferred, giving NTP its hierarchical nature. Finally, the offsets are averaged. In 2003, the Classless Time Protocol (CTP) was proposed [14]. This protocol reduces the offset errors using a novel non-hierarchical approach that employs a peer to peer protocol in which each node sends and receives probe packets only to and from its neighbors. The approach exploits convex optimization theory in order to evaluate the impact of each clock offset on the overall objective function. In addition, the authors suggest the separation of the round-trip delays to one way components in order to obtain a filtered measurement and to increase the accuracy of the synchronization procedure. It was shown that CTP substantially outperforms hierarchical schemes based on NTP in terms of clock accuracy while preserving similar protocol complexity. Solis, Borkar and Kumar [41, 12] have proposed an approach based on the concept of Least-Squares method, to smooth the set of estimates obtained by a packet exchange procedure. The accuracy of clock synchronization was improved by exploiting global network-wide constraints and the use of a completely asynchronous distributed algorithm employing only local broadcasts. The problem that results can be formulated as a distributed parameter estimation problem. They provide an alternate proof of the connection between the LS optimal set of estimates and electrical resistances in an equivalent resistive network. In addition, they analyze the convergence properties of the distributed synchronization algorithm they proposed. In fact, one can easily show that the CTP algorithm and the LS method are equivalent; the mathematical procedure is similar but written in two different ways. In the scheme Reference-Broadcast Synchronization (RBS) described in [9, 10], an intermediate node transmits a reference packet and the other nodes record the time at which they receive it. They then exchange this recorded time to find the differences between their clocks. The fundamental property of this scheme is that it synchronizes a set of receivers with one another, as opposed to traditional protocols in which senders synchronize with receivers. Hence, RBS is quite accurate because it is completely insensitive to transmission delays and asymmetries. The most significant limitation of RBS is that it requires a network with a physical broadcast channel. It cannot be used, for example, in networks that employ point-to-point links as considered in this thesis. In [9], the authors argue that the time synchronization schemes, like NTP were developed for traditional networks (e.g., the Internet) and are not very efficient in Wireless Sensor Networks (WSNs) applications, where many assumptions have changed. Then, they design the requirements and the principles for WSN time synchronization. More recently, in [26], the Flooding Time Synchronization Protocol (FTSP) is proposed; it uses MAC layer time stamping capabilities to eliminate several sources of error on the time synchronization process, and linear regression to compensate for the possible drifts in the clocks. A leader is elected through message exchanges and the global time is passed from the root to all the other nodes via flooding. Karp, Elson, Estrin and Shenker [21] have considered the problem of minimum variance estimation based on global information, particularly for the RBS scheme of [9], and have shown that it satisfies the transitive property of offsets, i.e., the sum of optimal estimate of

17

the offsets between the node pairs ( ),i jΛ Λ and ( ),j kΛ Λ is the optimal estimate of the

offset between the node pair ( ),i kΛ Λ . They have also analyzed the optimal error variance and related it to the resistance distance in the corresponding graph. Moreover, they show that the optimal pairwise synchronization and the globally consistent synchronization have the same technical answer and they treat clock skew and clock offset on different time scales. There are several proposals for synchronizing clocks within a single broadcast domain (e.g., [31]). These methods exploit the special properties of broadcast media and achieve high precision. However, nodes that do not lie within the same broadcast domain cannot be synchronized. Since our focus is on global clock synchronization, these local approaches are not an efficient solution to our problem. The most straightforward approach to synchronize clocks is to use the Global Positioning System (GPS), a constellation of satellites operated by the U.S. Department of Defense [19]. GPS provides accurate time synchronization relative to UTC [25], but its use is scarce in computer networks. GPS requires sensornet nodes to be equipped with special receivers, clear sky view and continuous reception of multiple satellites which is hard to accomplish inside buildings, underwater or beneath dense foliage. In addition, it may be too large, costly or high-power to a small and cheap sensor node. Another quite different approach is that taken in [37], which does not directly synchronize clocks but instead refers to events in terms of their age. When exchanging these timestamps, they are updated to reflect the passage of time. The last topic we present is related to time synchronization procedures using Kalman filtering. Two different schemes are considered. In [46], a time synchronization model on the Internet using Kalman filtering is proposed. The authors argue that the algorithm is more stable, more accurate and less sensitive to packet loss than the Simple Network Time Protocol (SNTP). SNTP [48] is a simplified version of NTP. The work in [20] is a heuristic approach based on adaptive Kalman filtering. The method is focused on a stochastic model of the network, which employs a KF and redundancy paths to achieve both an improved time and rate synchronization. The tests considered show an improvement of approximately two orders of magnitude in comparison to NTP. The schemes in [46, 20] are strongly related to our work but a number of essential differences exists. The problem considered in these works assumes that the network is composed of only two distinct computers and that just one set of measurements is available. Moreover, the proposed algorithms are totally centralized. On the other hand, in this research we are interested in large-scale systems with numerous nodes and the synchronization procedure has to be decentralized. Moreover, we will investigate the multiple measurement case through a recursive algorithm. In this review, we evoke only the main algorithms to synchronize clocks in computer networks, or more precisely to estimate the offsets at each node with respect to the universal time. We did not present the accepted techniques for adjusting the clocks physically, because it is a solved problem and beyond the scope of this research.

18

3.2 Decentralized Estimation In this part, we review several articles on decentralized (and distributed) estimation and more precisely on Decentralized Kalman Filter (DKF) algorithms. We investigate only dynamical state stochastic estimation and not the various literature on decentralized estimation of a deterministic unknown parameter corrupted by a noise. Centralized implementation of the Kalman Filter [18], although optimal, does not provide robustness and scalability when it comes to complex large-scale dynamical systems with their measurements distributed on a large geographical region. This is the reason why several distributed estimation algorithms using the KF framework were proposed. Much of the existing research on distributed Kalman Filters focuses on sensor networks monitoring low dimensional systems [36]. This scenario addresses the problem on how to efficiently incorporate the distributed observations, also referred to in the literature as 'data fusion' [15]. Among the first works on DKF, the work in [15] assumes the presence of a fusion center or a central coordinator for combining the information from the various local processors. Then, the algorithm in [36] does not require any form of central processing facility. Each sensing node implements its own local KF to arrive at a partial decision which it then broadcasts to every other node. This algorithm leads to the same optimal centralized KF solution and is highly resilient to loss of one or more sensing nodes. The main drawback of this method is that a fully connected network is considered, i.e., each node must talk to every other node and the design of a convenient communication topology is problematic. In [39], the authors present some results on the problem of optimally combining static estimates from different sensors locations when the measurement noise processes are correlated. The works of [7, 44] extend the previous existing theory to the entire class of Luenberger observers. In [7], a necessary and sufficient condition for combining local KF estimates into a global KF estimate is proved. It is shown that decentralized estimators work by combining local estimates through weighting matrices. The low-power filtering scheme described in [44] implements Luenberger observers. By allowing the local stations to communicate at the rate that the estimates are desired, instead of the faster measurement rate, it saves power while maintaining robustness and optimality. All the previous solutions require that the local filters propagate a state vector that is the size of the global state and the knowledge of the sensor network topology. The work in [4] explains the difference between decentralized and distibuted estimation. A decentralized network has no central facilities whereas a distributed network uses reduced order nodes operating in parallel to process local observations. Combining distribution and decentralization gives a new more efficient result and reduces the communication requirements. In [38], the DKF is applied to solve a common robotic application: cooperative localization. A new approach is presented where the centralized KF is decomposed in M modified Kalman filters each running on a separate robot. Moreover, the cross correlation terms between the different agents are computed in a distributed manner as well. A simulation example is considered where the authors show that an improvement in the localization accuracy is provided. In [3], the authors consider the problem of estimating vector valued variables from noisy relative measurements in a decentralized fashion. The time synchronization and the sensor localization are obtained as special cases of this more general framework. Two different

19

algorithms are proposed to compute the optimal estimate in a distributed and iterative manner. The case of temporary communication failures is treated and the algorithms are based on the Jacobi iterative method. The latter and the works performed in [14, 41, 12] formed the basis of the research presented in this thesis. More recently, several methods for sparsely connected large-scale networks were considered. For example, in [17], a computationally efficient, sub-optimal distributed KF is used to estimate a sparsely connected, large-scale dynamical system monitored by a network of N sensors. Local Kalman Filters of much lower dimension than the global state are implemented at each sensor node. Shared observations and estimates across different local models are fused using bipartite fusion graphs and consensus averaging algorithms. The advantage of this scheme is that a low order KF is implemented at each sensor and the structure of the centralized error covariances is conserved. In other words, the proposed solution contrasts with existing methods for sensor networks that either replicate a KF (whose dimension is equal to the global state vector) at each sensor node or reduce the model dimension at the expense of decoupling the field dynamics into lower-dimensional models (non-optimal). In non-linear and non-Gaussian scenarios, the DKF becomes inapplicable. Extended Kalman Filters, grid-based methods and Gaussian-sum filters are possible alternatives, but these all have limitations and information exchange is not as simple. The class of sequential Monte Carlo methods (or particle filtering) is attractive because of its power and flexibility. In [8], distributed implementations of particle filters are proposed. Since our domain of interest here is focused on DKF, we are not providing more details on these methods. The paper in [11] investigates several estimator architectures for determining the fleet state in the formation flying problem. This latter is non-linear and includes correlated states. The analysis shows that the proposed decentralized reduced order filters (like Schmidt-Kalman Filter) provide near optimal estimation results without excessive communication or computation and are preferable when compared to centralized and full order methods. The work in [32] presents an efficient method of multi-sensor estimation for systems with asynchronous observations. The architecture proposed is totally decentralized and each individual loop does not need to know about the other loops in the system. The resulting estimates are equivalent to the optimal centralized filter when the loops incorporate all the information available in the system. Recently, Alriksson et al. [2] addresses the problem of distributed Kalman filtering, with focus on limiting the required communication bandwidth. The authors refer to a scenario where all the nodes desire an estimate of the full state of the observed system, there is no central utility and the communication takes place only between neighbors. The nodes merge their estimates by a weighted average of the neighbouring estimates. The weights are optimized off-line to yield a small estimation error covariance. This problem was generalized to time varying states in [42, 34] using consensus filters. Next, we present a short literature survey on consensus algorithms.

20

3.3 Consensus Algorithms We restrict the review to methods that are associated with data fusion in networks. Consensus problems are related to situations in which all the network members are required to achieve a common output value, using only local interactions and without access to a global coordinator. Consensus filters are distributed algorithms that allow calculation of average-consensus of time-varying signals. Two fundamental papers on consensus algorithms are given in [45, 35]. Xiao and Boyd [45] consider the problem of finding a linear iteration that yields distributed averaging consensus over a network, i.e., that asymptotically computes the average of the initial values given at the nodes. This article is the basis of various works in this field. In [35], the authors provide a theoretical framework for analysis of consensus algorithms for multi-agent networked systems. This is a general overview that investigates static and dynamic topologies as well as the continuous and discrete time cases. In addition, it provides diverse applications that are related to consensus problems and presents the simulation results for three different applications. The work in [42] describes a dynamic consensus in order to obtain a distributed Kalman Filter for a network of agents. The algorithm consists of two loops: an outer loop for the KF and an inner loop for the weighted average consensus updating. Dynamic consensus allows to track in time different quantities and then to use them for distributed estimation. Olfati-Saber [34] solves the problem of distributed Kalman filtering for sensor networks by reducing it to two separate dynamic consensus problems in terms of weighted measurements and inverse covariance matrices. These problems were solved in a distributed way using a low-pass consensus filter for the fusion of the measurements and a band-pass consensus filter for the fusion of the inverse covariance matrices. It leads to an approximate distributed Kalman filtering algorithm that converges to the centralized optimal solution. Recently, in [6] the authors considered the problem of estimating the state of a scalar linear system from distributed noisy measurements. The estimation is performed via a two stage procedure which consists in (i) a standard decentralized Kalman-like measurement update and (ii) estimate fusion using consensus strategies. In order to attain this purpose two design parameters; the Kalman gain and the consensus matrix are designed by optimizing the steady state prediction error.

21

4. Model and Problem Definitions 4.1 Clock Model We suppose that the clock drift at a node follows the linear form: ( )i i iT t tα τ= + , where iα and iτ are the skew (rate deviation) and the offset parameters respectively, t is the real time (or the reference time) and ( )iT t is the local time (at node iΛ ). The above model is known as the two parameters linear model and is considered as a common way to model a clock in this context (see [41] and the references therein). The time synchronization problem relates to the task of setting the clocks in the network so that they all agree upon a particular epoch with respect to a Coordinated Universal Time (UTC). Without loss of generality, one can assume that node 1Λ is synchronized to the reference time:

1 0τ = and 1 1α = In our model, the clock synchronization relates to two different aspects: the rate synchronization (identical iα ) and the time offset synchronization (identical iτ ). For simplicity, we initially assume that all the clocks run at the same speed ( 1, ,i j i jα α= = ∀ ) and then we will relax this assumption in Section 9. 4.2 The Measurements Each node is sending probe packets to each one of its neighbors. Figure 4.1 depicts the situation for the pair of neighboring nodes iΛ and jΛ . Time is stamped on packet mk by the sender iΛ upon transmission ( ( )i mT k ) and by the receiver jΛ upon reception of the packet ( ( )j mR k ). Then, the node jΛ is retransmitting the packet back to the source ( ( )j mT k ) and the source stamps its local time when receiving the packet back ( ( )i mR k ).

Figure 4.1. Communication between two neighboring nodes. We intend to estimate the clock offsets by using these measurements data. Let us denote by

( )ij mT kΔ the time difference between the transmission of probe packet mk by node iΛ , according to iΛ clock, and the receiving time of the packet at node jΛ according to its own clock. This implies: ( ) ( ) ( ) ( )ij m j m i m ij m i j ijT k R k T k x k τ τ εΔ = − = − + + (4.2.1)

iΛ jΛ

22

Here, ( )ij mx k is the propagation delay and ijε is an additive unknown noise that represents the random queuing delay and the other unknown influences. In other words, ( )ij mT kΔ is equal to the one-way link delay experienced by probe packet mk when traveling from node

iΛ to jΛ ( ( )ij mx k ), plus the difference between the two clock offsets. Assuming that ( ) ( )ij m ji mx k x k= (the propagation delay is symmetric) we notice that:

( )12 ij ji j i ijT T τ τ εΔ −Δ = − +

where: ( )12ij ij jiε ε ε= −

4.3 Problem Formulation Our objective is to synchronize all the clocks in the network with the reference time. This is equivalent to estimate (using the Kalman Filter framework) iα and iτ at each network node. The algorithm is required to be decentralized and to converge to the optimal centralized solution. We initially assume that the offsets are time-invariant and that all the clocks run at the same speed (there is no skew), namely:

1, ,i j i jα α= = ∀ . These assumptions are reasonable if the clock synchronization procedure is applied at small enough time intervals. In the extensions section, we treat the case in which the previous assumptions are relaxed. The first step is to find the state space model applied to our problem. Then, we will write the Kalman Filter equations and develop the results in a centralized form. This result will represent the optimal clock offset vector in the MMSE sense. Later, we proceed to develop a decentralized implementation of the preceding filtering algorithm. The last steps include extending our algorithm to the case of different clock skews and to treat several interesting extensions. 4.4 State Space Model Let us define the state vector by the following column vector:

( )1 2( ) 0, ,... TNx n τ τ τ=

Here, iτ is the offset of the node iΛ . In this part, the offsets are assumed to be time invariant and we consider that all the clocks run exactly at the same speed (i.e., 1i jα α= = there is no skew). As we explained before, the measurements data for each pair of neighbor nodes is given by:

( )1ˆ2ij ij ij ji j i ijy O T T τ τ ε= Δ −Δ = − + (4.4.1)

23

Remark: ˆijO is the conventional notation for this category of measurements.

Equation (7) states that the relative measurement ijy for each pair of neighboring nodes is given by the difference between the offsets plus an additive noise ijε . According to the previous notation (see Section 2.4), the measurement equation of the state space model is related to the reduced incidence matrix A . Consequently, the state space model is given by:

( 1) ( ) ( )( ) ( ) ( )T

x n x n w ny n A x n v n

+ = +⎧⎨

= +⎩ (4.4.2)

Here, ( )w n and ( )v n are the system and measurement noises respectively. We assume that the offsets are time invariant, so we will neglect the process noise ( )w n for the time being. In Section 8.2, we will consider the more general case where ( )w n is incorporated back.

( )v n is the measurement noise and is assumed to be in accordance with the Kalman Filter assumptions (see Section 2.2). For simplicity, in all the state space representations it is assumed that the sampling time is uniform and equal to one.

24

5. Centralized and Decentralized Algorithms: Single Measurement Update In this section, we first present the Kalman Filter equations applied to our problem using two different approaches. Then, we develop a decentralized version of this filtering algorithm, and we will prove its convergence to the optimal centralized solution in the next section. By using the KF framework, we obtain the existing results from the literature as special cases and extend them to a more general framework. We consider separately the cases where the initial inverse covariance matrix is diagonal and non-diagonal. We assume here that only one set of measurements is available. We will extend our results to the case of multiple measurement sets in Section 7 by the use of a recursive decentralized algorithm. 5.1 Kalman Filter Equations Let us present the KF equations applied to our case. As we previously stated, the state vector is defined by:

( )1 2( ) 0, ,... TNx n τ τ τ=

Here, iτ is the offset of the node iΛ . In this part, the offsets are assumed to be time invariant and we consider that all the clocks run exactly at the same speed (i.e., 1i jα α= = , there is no skew). As we have previously explained, the state space model is given by:

( 1) ( ) ( )( ) ( ) ( )T


+ = +⎧⎨

= +⎩

In the previous notation, we have: , TI H AΦ = = . The Kalman Filter equations are therefore: Time update (prediction):

ˆ ˆ( 1| ) ( | )

( 1| ) ( | )x n n x n nP n n P n n Q

+ =⎧⎨ + = +⎩

Measurement update:

1

ˆ ˆ ˆ( 1| 1) ( 1| ) ( 1) ( 1) ( 1| )

( 1) ( 1| ) ( 1| )

( 1| 1) ( 1) ( 1| )

T

T

T

x n n x n n K n y n A x n n

K n P n n A A P n n A R

P n n I K n A P n n

−

⎧ ⎡ ⎤+ + = + + + + − +⎣ ⎦⎪⎪ ⎡ ⎤+ = + + +⎨ ⎣ ⎦⎪

⎡ ⎤+ + = − + +⎪ ⎣ ⎦⎩

Let us substitute the time update equation into the measurement update equation:

[ ] ( )

( ) ( ){ }( )

1

1

ˆ ˆ ˆ( 1| 1) ( 1| ) ( | ) ( | ) ( 1) ( 1| )

( 1| 1) ( | ) ( | ) ( | )

T T

T T

x n n x n n P n n Q A A P n n Q A R y n A x n n

P n n I P n n Q A A P n n Q A R A P n n Q

−

−

⎧ ⎡ ⎤⎡ ⎤+ + = + + + + + + − +⎣ ⎦ ⎣ ⎦⎪⎨

⎡ ⎤+ + = − + + + +⎪ ⎣ ⎦⎩

25

Then, the combined Kalman Filter equation (using the information form) is given by:

( )11 1 1ˆ ˆ ˆ( 1| 1) ( | ) ( | ) ( 1) ( | )T Tx n n x n n P n n Q AR A AR y n A x n n−− − −⎡ ⎤ ⎡ ⎤+ + = + + + + −⎣ ⎦⎣ ⎦ (5.1.1)

Next, we will obtain the above equation through the Least-Squares optimization approach. Afterwards, we will develop a decentralized version (requiring only local broadcast) of this filtering algorithm and in Section 6, we will prove its convergence to the optimal centralized solution. 5.2 The LS approach We consider the single measurement update case (only one set of measurements is available) and later on, we will extend the algorithms to the multiple measurement case by proposing a recursive algorithm (Section 7). We start with the pair of parameters 0 0,x P and our goal is to find ˆ optτ by using the Kalman Filter equations in a centralized fashion. 0x and 0P represent the a-priori knowledge and we want to include this information together with the measurements to find an optimal estimate. This is important, because this initial knowledge can improve the quality of the estimation. The KF solution is equivalent to the minimum of the following expression (see the proof in Appendix A):

min1 10 0 0 ˆ( ) ( ) ( ) ( ) (0)T T T TJ x x P x x y A x R y A x x− −= − − + − − ⎯⎯→ (5.2.1)

The first term of the objective function is related to the initial knowledge whereas the second term is associated with the single set of measurements and its corresponding covariance matrix. It is preferable to solve the above deterministic LS problem than to solve the KF equations directly. The main reason is that the KF solution gives only a centralized algorithm and it is difficult to decentralize the procedure, as we will see in the next section. We want to find the vector ˆoptx that minimizes the above objective function. Hence, we have to compute the gradient with respect to the vector x and set it to zero:

( ) ( )( ) ( )

1 1 1 10 0 0

11 1 1 10 0 0

0

ˆ (5.2.2)

Tx

Topt

J AR A P x AR y P x

x AR A P AR y P x

− − − −

−− − − −

∇ = + − − =

= + +

ˆoptx represents the optimal vector of offsets. One can note that the above equation is

equivalent to the combined Kalman Filter equation that we obtained previously. In addition, the posterior error covariance matrix is as follows:

( )( ) ( ) 11 10ˆ ˆ T TE x x x x AR A P

−− −⎡ ⎤− − = +⎣ ⎦ (5.2.3)

26

In the statistical WLS case, we substitute 10 0P − = in the objective function. The centralized

optimal solution is known as the BLUE (Best Linear Unbiased Estimator) and is given by:

( ) ( )11 1ˆ Tx AR A AR y−− −=

In addition, the error covariance matrix is as follows:

( )( ) ( ) 11ˆ ˆ T TE x x x x AR A−−⎡ ⎤− − =⎣ ⎦

In order to compute the optimal estimate directly (in a centralized manner), one seems to need all the measurements associated with their error covariances and the topology of the entire network. For networks with a large number of measurements, doing so will be prohibitively expensive in terms of energy consumption, bandwidth and communication time. Hence, it is more preferable to compute the estimates in a decentralized fashion, employing only local broadcasts. By decentralized we mean that at every step, each node computes its own estimate and the data required are obtained through communication with its one-hop neighbors. 5.3 Decentralized Algorithm: Optimality Equations Our purpose is to develop a decentralized algorithm that estimates the offsets of each node over the network. In this section, we consider only the single measurement case. The multiple measurement scenario will be considered later in Section 7. As in the previous section, we start with the parameters 0 0,x P and our goal is to find ˆ optτ by using the Kalman Filter in a decentralized fashion. In other words, we are looking for the minimizing solution ( )ˆ 0 | 0x of the following objective function:

1 10 0 0( ) ( ) ( ) ( )T T T TJ x x P x x y A x R y A x− −= − − + − −

a) The Basic LS Algorithm We first demonstrate our solution approach for the simplest case where only the second term is present i.e., 1

0 0P − = and all the measurements are equally weighted ( 1R R I−= = ). This corresponds to the Least-Squares solution presented in Section 2.1. In this case, the objective function is given by:

( )2

,

ˆ( ) ( )

i

T T Tji i j

i jj N

J y A x y A x O τ τ∈

= − − = − +∑

Differentiating J with respect to each one of the coordinates iτ leads to:

( ) ( ) ( )ˆ ˆ2 2 1 0i i i

Ti ji i j i ji ji

j N j N j Ni

J AA x A y O Oτ τ τ ττ ∈ ∈ ∈

⎡ ⎤∂= − = − − + = − − ⋅ + + =⎢ ⎥∂ ⎣ ⎦

∑ ∑ ∑

27

Let us substitute the following relations:

( )ˆ

i

i

Ti i ji

j N

i jij N

AA x N

A y O

τ τ∈

∈

⎧ = −⎪⎨

=⎪⎩

∑

∑

( )ˆ 0i

i i ji jj Ni

J N Oτ ττ ∈

∂= − + =

∂ ∑

From this, we get:

( )1 ˆi

i ji jj Ni

ON

τ τ∈

= +∑ (5.3.1)

The above equation must be satisfied by the optimal solution of the offset estimation problem. While this is a set of linear equations, a direct solution cannot be carried out in a decentralized manner. Instead, we will implement a decentralized iterative algorithm and show its convergence to the optimal centralized solution. This algorithm follows the Jacobi iteration that was described in Section 2.1. We will define the iterative procedure for the general case at the end of this section and we will prove its convergence in the next chapter. The above equation has a very simple interpretation. Each node computes its offset estimate as the average of all its neighbors' estimates plus the corresponding relative measurements. This procedure is the same as in [41, 12] and one can easily show that this is equivalent to the algorithm in [14]. Our objective is to extend the previous result to a wider framework and we will obtain this procedure as a special case of a more general algorithm. b) Weighted Least-Squares Now, we incorporate the weighting matrix while assuming that 1R− is a diagonal and positive definite matrix. These assumptions are reasonable since the matrix R represents the covariance of the uncorrelated measurement noise.

( )

( ) ( ) ( )

( )

21

,

1 1

1 ˆ( ) ( )

1 ˆ2 0

1 1 ˆ 0

i

i

i i

T T Tji i j

i j jij N

Tji i ji i

j Ni ji

i ji jj N j Nji ji

J y A x R y A x Or

J AR A x AR y Or

Or r

τ τ

τ ττ

τ τ

−

∈

− −

∈

∈ ∈

= − − = − +

∂= − = − − + =

∂

⋅ − + =

∑

∑

∑ ∑

From this, we get: ( )1 1 ˆ1

i

i

i ji jj N ji

j N ji

Or

r

τ τ∈

∈

= ⋅ +∑∑

(5.3.2)

28

We observe that the above formula is quite logical and consistent with our expectations namely, each measurement is multiplied by a weight equal to ( ) 1

jir−

according to its

quality. Moreover, we can check easily that if , . ., 1 ,jk jR I i e r j k N= = ∀ ∈ the result

reduces to the previous LS case. The coefficients ( ) 1

jir−

are related to the measurement ˆjiO ,

therefore ( ) ( )1 1

ji ijr r− −= since each measurement is issued by a bidirectional exchange

between the pair of nodes. Let us rewrite the previous equation in the following way:

( )1 1 1 1ˆ ˆi i i i

i ji j ji jj N j N j N j Nji ji ji ji

O Or r r r

τ τ τ∈ ∈ ∈ ∈

⎛ ⎞⋅ = + = +⎜ ⎟⎜ ⎟

⎝ ⎠∑ ∑ ∑ ∑

The above equation can be solved iteratively using a decentralized version of the Jacobi method (coordinate-wise), and converges to the optimal centralized solution (see the proof in [3]). Next, we will consider the general framework that includes the initial covariance matrix 0P in the objective function. The analysis is divided in two cases: diagonal and non-diagonal initial covariance matrix. c) General Framework: Diagonal 0P Finally, let us solve the original problem where the objective function is composed of two distinct terms:


Now,

( ) ( ) ( ) ( )

( ) ( ) ( )

1 1 1 10 0 0* *

1 10 0 0* *

0

1 1 ˆ ˆ 0i i

T

i i i ii

i ji j i ij N j Nji ji

J AR A x AR y P x P x

O P x P xr r

τ

τ τ

− − − −

− −

∈ ∈

∂= − + − =

∂

− + + − =∑ ∑

Here, ( )1

0 *iP − is the i-th row of the matrix 1

0P − . One can make the logical assumption that the initial inverse covariance matrix 1

0P − is a diagonal matrix. Indeed, 1

0P − represents the initial correlation between the different clocks in the network, and there is no reason to have some a-priori knowledge of the cross correlation terms but only on the variances of each clock (diagonal terms). In the case where 1

0P − is a diagonal matrix, we can obtain:

( )1 1 1 1ˆ (0)i i

i ji j ij N j Nji i ji i

Or p r p

τ τ τ∈ ∈

⎛ ⎞+ = + + ⋅⎜ ⎟⎜ ⎟

⎝ ⎠∑ ∑

29

This implies:

( ) (0)1 1 ˆ1 1 i

i

ii ji j

j N ji i

j N ji i

Or p

r p

ττ τ∈

∈

⎡ ⎤= ⋅ + +⎢ ⎥⎛ ⎞ ⎢ ⎥⎣ ⎦+⎜ ⎟⎜ ⎟⎝ ⎠

∑∑

(5.3.3)

In other words, iτ is given by the weighting average between the adjacent measurements and the initial knowledge related to it. We can notice that if the matrix 1

0P − is identically equal to zero, we obtain the same equation as the previous Weighted Least-Squares case (5.3.2). d) General Framework: Non-Diagonal 0P In general, the initial covariance matrix 0P need not be a diagonal matrix. As we will explain at the end of Section 6, even if the initial covariance matrix 0P is chosen to be diagonal, after the first iteration of the Kalman Filter, the inverse covariance matrix will not preserve its diagonal structure. Hence, if we have a-priori knowledge of the system or if multiple sets of measurements are available, we must consider the case in which the covariance matrix is not assumed to be diagonal. In this more general case, we get:

( ) ( ) ( )

( ) ( ) ( ) ( ) ( )

( ) ( ) ( ) ( )

10

1

1 10 0

1

1 1 10 0 0

1

1 1 ˆ (0) 0

1 1 ˆ (0) (0) 0

1 1 ˆ (0) (

i i

i i

i i

N

i ji j k kikj N j N ki ji ji

N

i ji j k k i iik iij N j N kji ji

k i

N

i ji j i k kii ii ikj N j N kji ji

k i

J O Pr r

O P Pr r

P O P Pr r

τ τ τ ττ

τ τ τ τ τ τ

τ τ τ τ τ

−

∈ ∈ =

− −

∈ ∈ =≠

− − −

∈ ∈ =≠

∂= − + + − =

∂

− + + − + − =

⎛ ⎞+ = + + − −⎜ ⎟⎜ ⎟

⎝ ⎠

∑ ∑ ∑

∑ ∑ ∑

∑ ∑ ∑ ( )0)

This implies:

( )( ) ( ) ( ) ( )1 1

0 011

0

1 1 ˆ (0) (0)1 i

i

N

i ji j i k kii ikj N kji

k iii

j N ji

O P Pr

Pr

τ τ τ τ τ− −

∈ =−≠

∈

⎡ ⎤⎢ ⎥= + + − −⎢ ⎥⎛ ⎞⎢ ⎥+ ⎣ ⎦⎜ ⎟⎜ ⎟

⎝ ⎠

∑ ∑∑

(5.3.4)

The main problem in equation (5.3.4) is that each node needs to communicate with all the other nodes and not only with its neighbors. Thus, in the case where the matrix 0P is not diagonal, each node has to know the global topology of the entire network. As we previously explained, the initial covariance matrix 0P can be assumed to be diagonal. However, after applying the Kalman Filter equations, the covariance matrix ( | )P k k will not be diagonal anymore.

30

In summary, we have presented the optimality equations for the four different cases. In this section, we only considered the case with a single set of measurements. In order to implement the optimality equations, we propose an iterative decentralized algorithm presented next. 5.4 Decentralized Algorithm: Iterative Equations The four decentralized optimality equations we previously presented can be applied to a network in order to estimate the clock offsets at each node with respect to the reference time. The suggested distributed optimization is iterative. There are many iterative methods that can be used [43, 22]. Each iterative algorithm can be implemented either in a synchronous manner or in an asynchronous way, see [5] for more details about parallel and distributed computations. In the remainder of this work, we focus on the synchronous versions of the different algorithms. Convergence can be accelerated by over-relaxation techniques which are standard in numerical analysis [5]. In this research thesis, we will not be concerned with the number of iterations and rate of convergence, as long as convergence is achieved after an infinite number of iterations. As we previously mentioned, if only one set of measurements is available, we can apply the algorithm assuming that 0P is a diagonal matrix. The reason is that at the beginning, it is reasonable to assume that the a-priori covariance matrix is diagonal, i.e., the initial offsets are uncorrelated. In Section 7, we will present a recursive version of the previous algorithms for the multiple measurement case. For the single measurement case, the corresponding decentralized synchronous algorithm that implements the optimal equation in (5.3.3) is given by (assuming that 0P is diagonal):

( )( 1) ( ) (0)1 1 ˆˆ ˆ (5.4.1)1 1 i

i

k k ii ji j

j N ji i

j N ji i

Or p

r p

ττ τ+

∈

∈

⎡ ⎤= ⋅ + +⎢ ⎥⎛ ⎞ ⎢ ⎥⎣ ⎦+⎜ ⎟⎜ ⎟⎝ ⎠

∑∑

Initialization: (0)ˆ (0) 2,3,...i i i Nτ τ= = . Here, 0k ≥ is the iteration number. In other words, we obtained in (5.3.3) the optimal equation that computes the offsets at each node in a decentralized fashion. In order to implement this optimal equation, we propose the iterative synchronous algorithm in (5.4.1). The procedure in (5.4.1) requires only local broadcasts (communication with neighbors) and as we will show in the next section, converges to the optimal centralized solution (after an infinite number of iterations in all the nodes). Let us summarize the procedure for applying the algorithm in (5.4.1). First of all, we assume that the nodes detect their neighbors and exchange bilateral packets to obtain their relative measurements ˆ

jiO . In addition, they exchange their relative inverse covariances

( ) 1

jir−

and the inverse initial covariance ( ) 1ip − . The initial offset (0)iτ are known at each

node, so the quantity 1 1

ij N ji ir p∈

+∑ can be computed at the beginning of the procedure.

31

After the deployment of the network, the reference node initializes its estimate to 0 and never changes it. Every other node in the network initializes its estimate to an arbitrary value. At the start of iteration 1k + , each node sends its most recent estimate ( )ˆ k

iτ to its neighbors along with the corresponding iteration number. It also gathers the estimates of its neighbors, ( )ˆ k

j j iNτ Λ ∈ and then updates its own estimate by applying the above equation

for ( 1)ˆ kiτ

+ . The algorithm is summarized in Table 5.1.

32

Name: Decentralized Synchronous Kalman Filter Assumptions: - The inverse initial covariance matrix 1

0P − is diagonal. - Only one set of measurements is available.

Goal: Compute in a decentralized manner the offsets estimates at each network node (except for the reference) that approach the optimal centralized estimates.

Initialization: ( )1 0k kτ = ∀ , (0)

iτ is arbitrary for { }\ 1i V∈ . After deployment, each node { }\ 1i V∈ performs:

1. Detect its neighbors iN .

2. Identify the inverse initial covariance 1

ip and the initial offset (0)iτ .

3. Obtain one set of relative measurements ˆjiO and the associated inverse covariances

1

jir for every ij N∈ . Compute 1 1

ij N ji ir p∈

+∑ .

At every iteration k , each node iΛ performs:

4. Send ( )ˆ kiτ and k to its neighbors ij N∈ . Obtain ( )ˆ k

j ij Nτ ∈ .

5. Compute ( 1)ˆ kiτ

+ from the previous quantities, using (5.4.1).

Table 5.1. Summary of the Decentralized Synchronous Kalman Filter Algorithm.

The end of the algorithm is determined according to a termination condition on the absolute difference between the estimates at two successive iterations:

{ }( 1) ( )ˆ ˆ \ 1k ki i i Nτ τ ε+ − < ∈ (5.4.2)

In the next section, we will prove the convergence of the previous iterative decentralized algorithm to the optimal centralized solution.

33

6. Convergence Analysis Let us recall that the clock synchronization problem considered in this thesis (with a single set of measurements) can be summarized into the following LS optimization problem:


In order to find the optimal solution, one has to minimize the above objective function with respect to the vector x . In the previous chapter, we divided the analysis into four special cases and we obtained the optimal solution for each case. Then, one can implement the optimal solution using an iterative decentralized algorithm (equivalent to the Jacobi method). We now prove the convergence of the previous decentralized clock synchronization algorithms to the optimal centralized solution. We consider the most general case where the initial covariance matrix 0P is not assumed to be a diagonal matrix. In this case, the synchronous decentralized algorithm is given by:

( )( ) ( ) ( ) ( )( 1) ( ) 1 1 ( )

0 011

0

1 1 ˆˆ ˆ ˆ(0) (0)1 i

i

Nk k k

i ji j i m mii imj N mji

m iii

j N ji

O P Pr

Pr

τ τ τ τ τ+ − −

∈ =−≠

∈

⎡ ⎤⎢ ⎥= + + − −⎢ ⎥⎛ ⎞⎢ ⎥+ ⎣ ⎦⎜ ⎟⎜ ⎟

⎝ ⎠

∑ ∑∑

(6.1)

Initialization: (0)ˆ (1) (0) 2,3,...i i i Nτ τ= = . Here, 0k ≥ is the iteration number. Theorem 6.1 Suppose that:

a) A single set of measurements is available. b) The matrix R is diagonal and PSD, that is: ( ) 1

0 ,jir i j−

∞ > ≥ ∀ . c) The offsets are time-invariant. d) The initial state vector 0x is known. e) The initial covariance matrix 0P is known and is an M-matrix, namely:

( )

( )( ) ( )

10

10

10

0

0

0

ijj

ii

ij

P

P

P i j

−

−

−

⎧ ≥⎪⎪

≥⎨⎪

≤ ≠⎪⎩

∑

(6.2)

f) The clock adjustment operation in (6.1) is applied synchronously by all nodes ( 2,3,...i N= ) in all iterations.

Then, the iterated estimators ( )ˆ ( ) 2,3,...k

i n i Nτ = converge (as k →∞ ) to the optimal offsets that minimize the objective function:


namely, the set of offsets that would have been obtained by performing the centralized optimal protocol.

34

In the remainder of this section, we present the proof of this result. For the first two special cases (Least-Squares and Weighted Least-Squares), the convergence can be proven in the following way. The first step consists of proving that the objective function to be minimized ( )J τ is non-increasing in time. The second step is to show that if the clock adjustment operation is applied by all nodes in all iterations, the set of estimated offsets converges to the set of offsets that minimize the objective function i.e., the set of offsets that would have been obtained by performing the centralized protocol (for the LS case, see the proof in [14]). For the WLS algorithm the convergence condition is that: ( ) 1

0 ,jir i j−

∞ > ≥ ∀ or in other words, a sufficient condition is that the matrix R is diagonal and PSD (Positive Semi-Definite). Here, we will use another more convenient technique to prove the convergence of the proposed algorithm, in the general case. Let us recall that the general objective function is given by:


Let us analyze the convergence properties of the general case, where 0P is not necessarily assumed to be a diagonal matrix. We note that the iteration (6.1) cannot be easily decentralized when 0P is not diagonal as we explained in Section 5. However, the iteration is still well defined mathematically. The synchronous iteration can be written in vector form: ( ) ( )1( 1) ( ) 1 ( ) 1 1 1 ( )

0 0 0ˆ ˆ ˆ ˆk k T k kD P AR A AR y P x Pτ τ τ τ−+ − − − −= − + − − + (6.3)

Here: ( )1

11

0i

ij j N ji

i jD r

otherwise

−

∈

⎧ =⎪⎪= ⎨⎪⎪⎩

∑ and ( ) ( )10

00

ijij

P i jP

otherwise

−⎧ =⎪= ⎨⎪⎩

The optimal solution (equivalent to perform the centralized protocol) is the same as before:

( ) ( )1* 1 1 1 1

0 0 0TAR A P AR y P xτ

−− − − −= + +

Let us define: ( ) ( ) *ˆk kτ τ τ− (6.4) Then we get:

( ) ( )( ) ( )

( ) ( ) ( ) ( )

( ) ( )( )

1( 1) ( 1) * ( ) 1 ( ) 1 1 1 ( )0 0 0 0

11 1 1 10 0 0

1 1( 1) 1 1 ( ) 1 1 1 10 0 0 0 0

1 11 1 1 1 10 0 0 0

ˆ ˆ ˆ ˆ

ˆ

k k k T k k

T

k T k T

T T

D P AR A AR y P x P

AR A P AR y P x

I D P AR A P AR A P AR y P x

D P AR A P AR A P AR y P

τ τ τ τ τ τ

τ τ

−+ + − − − −

−− − − −

− −+ − − − − − −

− −− − − − −

− = − + − − + −

− + +

⎡ ⎤= − + + − + + +⎢ ⎥⎣ ⎦

+ + + + +( )

( ) ( )*

10

1( 1) 1 1 ( )0 0

k T k

x

I D P AR A P

τ

τ τ

−

−+ − −⎡ ⎤= − + +⎢ ⎥⎣ ⎦

35

We denote:

( ) ( )1 1 10 0

TM I D P AR A P− − −− + + (6.4)

We have the equivalent iteration:

( 1) ( )k kMτ τ+ = Thus, the convergence of the sequence ( )ˆ kτ to *τ is equivalent to the convergence of ( )kτ to the zero vector, which is determined by the matrix M given in (6.4). The necessary and sufficient condition for this to happen is that the spectral radius of M is strictly smaller than 1. According to proposition 2.5, the above iteration equation converges to zero if the sufficient conditions apply for the matrix iteration M . In other words, the row sums of the matrix M are less or equal than 1 (and at least in one row this sum is strictly smaller than 1), all the entries of the matrix M are non-negative and that the matrix M is irreducible; i.e., we can move from any node to any other node by a direct trajectory. In our network model, this last condition that the matrix M is irreducible requires the assumption that the graph that corresponds to the network is connected, namely there exists a path between any pair of nodes in the network. Some additional important assumptions are that each node has at least one neighbor (not including the reference node) and that the links are bidirectional (symmetric). The structure of the matrix M can be determined by inspection as the following:

( )

( )

( )( )

10

10

10

10

1

,1

0

1

i

i

ijji

iij N ji

ii ij

ij

iij N ji

Pr

i j and i j are neighborsP

rM MP

otherwiseP

r

−

−

∈

−

−

∈

⎧ − +⎪⎪ ≠⎪ +⎪

= = ⎨⎪ −⎪⎪

+⎪⎩

∑

∑

In order to show that the spectral radius of the iteration matrix is strictly smaller than 1, we will require that the matrix M is both non-negative and sub-stochastic (see Section 2.5). Let us find the conditions for the row sums of the matrix M to be smaller or equal to 1:

( ) ( )

( )

( )

( )

1 10 0

10

10

10

1

1

1

11

i i

i

i

i

ij ijj N j Nji

j iij

jii

j N ji

ijj N j iji

iij N ji

P Pr

MP

r

Pr

Pr

− −

∈ ∉≠

−

∈

−

∈ ∀ ≠

−

∈

⎡ ⎤− + −⎢ ⎥⎢ ⎥⎣ ⎦

= =+

−= ≤

+

∑ ∑∑

∑

∑ ∑

∑

36

From this we get:

( ) ( )

( ) ( )

1 10 0

1 10 0

1 1

0i i

ij iij N j i j Nji ji

ii ijj i

P Pr r

P P

− −

∈ ≠ ∈

− −

≠

− ≤ +

+ ≥

∑ ∑ ∑

∑

This implies: ( )10 0

ijj

P − ≥∑

In other words, we obtained that the necessary condition is that for each node iΛ , the row

sum of the matrix 10P − has to be non-negative. In addition, the condition: ( ) 1

0 ,jir i j−

∞ > ≥ ∀

( ji ijr r= ) must hold too. Requiring that all the entries of the matrix M are non-negative leads to:

( )( ) ( )

10

10

0

0ii

ij

P

P i j

−

−

⎧ ≥⎪⎨

≤ ≠⎪⎩

Hence, we can write:

( ) ( )1 10 0 0

ii ijj i

P P− −

∀ ≠

≥ − ≥∑

The above requirement can be seen as a diagonal dominance condition over the matrix 1

0P − . In the case that the node iΛ is adjacent to the reference node, the corresponding row sum of the M matrix is given by:

( ) ( ) ( )

( )

( ) ( )

( )

1 1 10 0 01

1

10

1 10 0 1

1

10

1 1

1

1 1

11

i i

i

i

i

ij i ijj N j Nji i

j i

iij N ji

ij ij N j iji i

iij N ji

P P Pr r

Pr

P Pr r

Pr

− − −

∈ ∉≠

−

∈

− −

∈ ∀ ≠

−

∈

⎡ ⎤ ⎡ ⎤− + − − + −⎢ ⎥ ⎢ ⎥⎢ ⎥ ⎣ ⎦⎣ ⎦

=+

⎡ ⎤ ⎡ ⎤− − −⎢ ⎥ ⎢ ⎥

⎢ ⎥ ⎣ ⎦⎣ ⎦= <+

∑ ∑

∑

∑ ∑

∑

37

In the case that the node iΛ is not adjacent to the reference node, the corresponding row sum of the M matrix is given by:

( ) ( ) ( )

( )

( ) ( )

( )

1 1 10 0 0 1

10

1 10 0 1

10

1

1

1

11

i i

i

i

i

ij ij ij N j Nji

j i

iij N ji

ij ij N j iji

iij N ji

P P Pr

Pr

P Pr

Pr

− − −

∈ ∉≠

−

∈

− −

∈ ∀ ≠

−

∈

⎡ ⎤− + + − −⎢ ⎥⎢ ⎥⎣ ⎦

=+

⎡ ⎤− −⎢ ⎥

⎢ ⎥⎣ ⎦= <+

∑ ∑

∑

∑ ∑

∑

Hence, we have shown that at least in one row, the row sum of M is strictly smaller than 1. Actually, we proved that the iteration matrix verifies all the sufficient conditions for convergence. Namely, the row sums of the matrix M are less or equal than 1 (and at least in one row this sum is strictly smaller than 1), the matrix M is irreducible and all its entries are non-negative. ■ As a result, we proved the convergence of the decentralized algorithm to the optimal solution performed by the centralized Kalman Filter for the most general case. Now, we will conclude the same for the other methods as special cases of the previous general framework. If the matrix 0P is assumed to be diagonal and PSD, the decentralized algorithm converges. To see that, we have just to substitute 0ijp i j= ≠ and to check that

the condition ( ) 1Mρ < is still verified . If 0 0P = , we will obtain the Weighted-Least-Squares case, and it is easy to check that in this case too, the convergence of the decentralized algorithm is achieved. The proof for the most basic case (equivalent to LS) is provided in the literature by Giridhar et al. [12] but all the other cases are to the best of our knowledge original. To sum up, the convergence conditions are given by:

( )( )

( )( ) ( )

1

10

10

10

is diagonal and PSD, that is: 0 ,

0

0

0

ji

ijj

ii

ij

R r i j

P

P

P i j

−

−

−

−

⎧ ∞ > ≥ ∀⎪⎪ ≥⎪⎨⎪ ≥⎪⎪ ≤ ≠⎩

∑

We end this section by an additional lemma that will be of importance in the next chapter.

38

Lemma 6.1 The convergence conditions have to be checked only once at the beginning of the procedure. In other words, the Kalman Filter operations preserve the convergence properties, that is: if 0P is an M-matrix, then nP is an M-matrix for all 1n ≥ . Proof In the information form of the discrete time Kalman Filter, the measurement update equation of the inverse covariance matrix is given by:

( ) ( )1 1 1( 1| ) ( | ) TP n n P n n H R H− − −+ = + (6.5) In our case TH A= , hence we obtain: ( ) ( )1 1 1( 1| ) ( | ) TP n n P n n AR A− − −+ = + .

Recalling that the matrix 1R− is assumed to be diagonal, let us analyze the properties of the matrix 1 TAR A− . For the regular incidence matrix, we have:

1

1 0. .. .1 0

TAR A−

⎛ ⎞ ⎛ ⎞⎜ ⎟ ⎜ ⎟⎜ ⎟ ⎜ ⎟=⎜ ⎟ ⎜ ⎟⎜ ⎟ ⎜ ⎟⎝ ⎠ ⎝ ⎠

and for the reduced incidence matrix we have:

1

1..1

TAR A v−

⎛ ⎞⎜ ⎟⎜ ⎟ =⎜ ⎟⎜ ⎟⎝ ⎠

Here, v is a vector with non-negative components. The structure of the matrix 1 TAR A− is as follows:

( )( )

1

1

0

0

T

ii

T

ij

AR A

AR A

−

−

⎧ >⎪⎨

≤⎪⎩

The row sums are ( )1 0T

ijj

AR A−

∀

=∑ for each node iΛ that is not adjacent to the reference

node. Besides, if the node iΛ is adjacent to the reference node, this sum is a strictly positive number. Hence, we can draw the conclusion that if the a-priori inverse covariance matrix ( ) 1( | )P n n − verifies the convergence conditions, the a-posteriori inverse covariance

matrix ( ) 1( 1| )P n n −+ will verify them too. As a consequence, we have to check these conditions only once at the beginning of the procedure. This result will be useful in the next section. ■ In the next section, we extend the analysis to the case of multiple measurement sets and we propose an optimal decentralized recursive algorithm.

39

7. Recursive Algorithms: Multiple Measurement Update In the previous sections, only the case of one measurement update was investigated. The next step is to consider the multiple measurement case and to present a recursive version of the previous decentralized algorithms. Several sets of measurements become successively available and our goal is to develop an algorithm that estimates the offsets after each set of measurements. The first alternative is to solve this problem by proposing a recursive algorithm. The latter is required to be decentralized and to depend only on the last measurement and on the previous estimates. The second option to solve the multiple measurement case is to wait for all the measurements and then to perform the estimation procedure. We will consider this case at the end of this section by proposing a decentralized non-recursive algorithm. In the subsequent analysis, we still focus on the case where the initial covariance matrix 0P is diagonal. The main problem is that after the first measurement update in the KF equations, the covariance matrix is not diagonal anymore. Then, each node has to communicate with all the other nodes over the network and not only with the one-hop neighbors. First, we present the centralized Kalman Filter algorithm and then we propose an optimal decentralized procedure that is equivalent to the KF solution. Afterward, we suggest a decentralized sub-optimal algorithm and a decentralized non-recursive method. 7.1. The Centralized KF Algorithm In Section 5.2, we obtained that the centralized Kalman Filter optimal solution (using the LS approach) for a single set of measurements is given by:

( ) ( )11 1 1 10 0 0ˆ Tx AR A P AR y P x

−− − − −= + +

We will now develop the corresponding recursive version given n sets of measurements. By repeating the previous derivation, we obtain:

( )

( )( )

11 1 1 10 0 0

1

111 1 1 10 0 0

1

ˆ( ) ( )

ˆ( 1) 1 ( )

nT

k

nT

k

x n n AR A P AR y k P x

x n n AR A P AR y k P x

−− − − −

=

−−− − − −

=

⎧ ⎛ ⎞= ⋅ + +⎪ ⎜ ⎟

⎪ ⎝ ⎠⎨

⎛ ⎞⎪ − = − ⋅ + +⎜ ⎟⎪ ⎝ ⎠⎩

∑

∑

This implies:

( ) ( )( )11 1 1 1 10 0ˆ ˆ( ) 1 ( 1) ( )T Tx n n AR A P n AR A P x n AR y n

−− − − − −⎡ ⎤= ⋅ + − ⋅ + − +⎣ ⎦ (7.1.1)

40

Hence, we obtained a recursive relation for the estimated vector of offsets. The equation in (7.1.1) corresponds to the centralized version of the recursive algorithm. We can simplify this equation in the following way:

( ) ( )( )( ) ( )

11 1 1 1 10 0

11 1 1 10

1

ˆ ˆ( ) 1 ( 1) ( )

ˆ ˆ ˆ( ) ( 1) ( ) ( 1)

ˆ ˆ ˆ( ) ( 1) ( | ) ( ) ( 1)

T T

T T

T

x n n AR A P n AR A P x n AR y n

x n x n n AR A P AR y n AR A x n

x n x n P n n AR y n A x n

−− − − − −

−− − − −

−

⎡ ⎤= ⋅ + − ⋅ + − +⎣ ⎦

⎡ ⎤= − + ⋅ + − −⎣ ⎦⎡ ⎤= − + − −⎣ ⎦

Let us now discuss several applications related to this centralized protocol. The centralized algorithm can be applied to a variety of situations. For example, one can imagine a set of fully intercommunicating nodes or the situation where a central unit is in charge. We can also apply the centralized protocol not solely on the entire network but also locally. Namely, we can estimate the offsets of a group of nodes that are regrouped geographically. For each group, we will assign a single processor that can communicate with all the group members. Another important application is the scenario in which a node needs to estimate simultaneously several different variables. For instance, in the sensor localization problem, each node iΛ has to estimate his position by evaluating the three coordinate ( )ˆ ˆ ˆ, ,i i ix y z . In this case, each node needs to estimate a vector of several variables so it is worthwhile to extend our algorithm to the vector case (centralized version). Similarly to the single measurement update, applying the centralized KF algorithm for networks with a large number of measurements, will be prohibitively expensive in terms of energy consumption, bandwidth and communication time. In other words, in order to compute the optimal estimate directly (in a centralized manner), one seems to need all the measurements associated with their error covariances and the topology of the entire network (because the covariance matrix ( | )P n n is non-diagonal). The solution we propose is a recursive decentralized algorithm based on the same LS approach as in the previous chapters. 7.2 The Optimal Decentralized Algorithm We first consider the two measurements case and then extend to the general framework of n sets of measurements. We start with the initial parameters ( )0 0,x P , where 0P is chosen to be a diagonal matrix. When the first set of measurements, say (1)y , arrives we can estimate the offset vector ˆ(1)τ and calculate the a-posteriori inverse covariance matrix

1(1)P − . This matrix will not be diagonal in general. The next step begins when the second set of measurements (2)y arrives. As before, we can estimate the offset vector ˆ(2)τ and calculate the a-posteriori inverse covariance matrix 1(2)P − . But here, as the matrix 1(1)P − is not diagonal, the synchronization procedure is more difficult than the previous one because the Jacobi method does not lead to a distributed algorithm. During the second step, each node has to communicate with the entire network and not only with its neighbors. This was discussed in Section 5.2. On the other hand, we can wait that the two sets of measurement ( )(1), (2)y y arrive and then estimate the offset vector ˆ(2)τ and calculate the

a-posteriori inverse covariance matrix 1(2)P − by applying the Kalman Filter equations to

41

the combined measurement vector. This one step method will give the same result as the two steps method, but the procedure is easier because 0P is diagonal and then the algorithm is decentralized. Our goal is to develop a scheme that combines the better features of both methods: namely, a recursive estimation algorithm that is still decentralized. Let us solve the problem when two sets of measurements are available. Then, the objective function is given by:

1 1 10 0 0( ) ( ) ( (1) ) ( (1) ) ( (2) ) ( (2) )T T T T T T TJ x x P x x y A x R y A x y A x R y A x− − −= − − + − − + − −

First, we compute the optimal offset estimate of node iΛ given the first set of measurements. When the matrix 0P is assumed to be diagonal and only one set of measurements is considered, we have obtained previously the following clock synchronization algorithm:

( )( 1) (1) ( ) (0)1 1 ˆˆ ˆ(1) (1) ; 2,3,...1 1 i

i

k k ii ji j

j N ji i

j N ji i

O i Nr p

r p

ττ τ+

∈

∈

⎡ ⎤= ⋅ + + =⎢ ⎥⎛ ⎞ ⎢ ⎥⎣ ⎦+⎜ ⎟⎜ ⎟⎝ ⎠

∑∑

(7.2.1)

Initialization: (0)ˆ (1) (0) 2,3,...i i i Nτ τ= = . Here, 0k ≥ is the iteration number. Equation (7.2.1) corresponds to a synchronous decentralized iterative algorithm. As we previously have shown, when the above procedure is applied by all nodes in all iterations, the set of offset estimates converges to the optimal centralized solution (if the matrix 1

0P − is chosen according to the conditions convergence). After repeating this algorithm an infinite number of iterations, we will obtain the optimal offset estimate ˆ (1)iτ . In other words, ˆ (1)iτ is the optimal offset estimate (after the convergence of the above algorithm) of the node iΛ given the set of measurements (1)y . The next step is to compute the optimal offset estimate of node iΛ given the sets of measurement ( )(1), (2)y y . By repeating the one-measurement derivation, we obtain:

( ) ( )( 1) (1) ( ) (2) ( )(0)1 1 1ˆ ˆˆ ˆ ˆ(2) (2) (2)1 12 i i

i

k k kii ji j ji j

j N j Nji i ji

j N ji i

O Or p r

r p

ττ τ τ+

∈ ∈

∈

⎡ ⎤= ⋅ + + + +⎢ ⎥

⎢ ⎥⎣ ⎦+∑ ∑

∑ (7.2.2)

Let us try to express ˆ (2)iτ as a function of ˆ (1)iτ (recursively). In equation (7.2.1), after an infinite number of iterations in all the nodes, we will get the following steady-state equations:

( )(1) (0)1 1 ˆˆ ˆ(1) (1)1 1 i

i

ii ji j

j N ji i

j N ji i

Or p

r p

ττ τ∈

∈

⎡ ⎤= ⋅ + +⎢ ⎥⎛ ⎞ ⎢ ⎥⎣ ⎦+⎜ ⎟⎜ ⎟⎝ ⎠

∑∑

42

Let us find an expression for (1)1 ˆ

i

jijij N

Or∈

∑ from the above equation:

(1) (0)1 1 1 1ˆ ˆ ˆ(1) (1)i i i

iji i j

ji ji i ji ij N j N j NO

r r p r pτ

τ τ∈ ∈ ∈

⎛ ⎞⎜ ⎟= + ⋅ − −⎜ ⎟⎝ ⎠

∑ ∑ ∑

Now, let us replace in equation (7.2.2):

( 1) (0)1 1 1 1ˆ ˆ ˆ(2) { (1) (1)1 12 i i

i

k ii i j

j N j Nji i ji i

j N ji i

r p r pr p

ττ τ τ+

∈ ∈

∈

⎛ ⎞= ⋅ + ⋅ − −⎜ ⎟⎜ ⎟

⎝ ⎠+∑ ∑

∑

( ) (0)1 ˆ (2)i

k ij

j N ji ir pττ

∈

+

+ +∑ ( )(2) ( )1 ˆ ˆ (2) }i

kji j

j N ji

Or

τ∈

+ +∑

From this, we get:

( ) ( )( 1) (2) ( )ˆ (1)1 1 1 ˆˆ ˆ ˆ ˆ(2) (1) (1) 2 (2)1 12 i i

i

k kii i j ji j

j N j Nji i ji

j N ji i

Or p r

r p

ττ τ τ τ+

∈ ∈

∈

⎡ ⎤= ⋅ − + + +⎢ ⎥

⎢ ⎥⎣ ⎦+∑ ∑

∑

which is the required expression. Now, let us generalize this idea for n sets of measurements, i.e., to find a recursive relation between ( 1)ˆ ( )k

i nτ + and ˆ ( 1)i nτ − . We consider the general case in which the matrix 1R− may be different for each set of measurements. Hence, the objective function is given by:

1 10 0 0

1( ) ( ) ( ( ) ) ( ) ( ( ) )

nT T T T

kJ x x P x x y k A x R k y k A x− −

=

= − − + − −∑ (7.2.3) First, differentiate the objective function (7.2.3) with respect to the offsets and set the partial derivatives to zero:

( ) ( ) ( )( ) 1 10 0 0* *

1 1

1 1 ˆ 0( ) ( )

i i

T nk

i ji j i ik j N k j Ni ji ji

J O P x P xr k r k

τ ττ

− −

= ∈ = ∈

⎡ ⎤ ⎡ ⎤∂= ⋅ − + + − =⎢ ⎥ ⎢ ⎥

∂ ⎢ ⎥ ⎢ ⎥⎣ ⎦ ⎣ ⎦∑∑ ∑ ∑

As usual, we assume that the initial covariance matrix 0P is diagonal and obtain the corresponding iterative algorithm with combined measurements:

( )( 1) ( ) ( )

1

1

(0)1 1 ˆˆ ˆ( ) ( )( )1 1

( )i

i

nk k k i

i ji jn k j N ji i

k j N ji i

n O nr k p

r k p

ττ τ+

= ∈

= ∈

⎧ ⎫⎡ ⎤⎪ ⎪= + +⎢ ⎥⎨ ⎬⎡ ⎤ ⎢ ⎥⎪ ⎪⎣ ⎦⎩ ⎭+⎢ ⎥⎢ ⎥⎣ ⎦

∑ ∑∑ ∑

Initialization: (0)ˆ ( ) (0) 2,3,...i in i Nτ τ= = . Here, 0k ≥ is the iteration number.

43

The above formula corresponds to a synchronous decentralized iterative algorithm. As we previously have shown, when the above procedure is applied by all nodes in all iterations, the set of offset estimates converges to the optimal centralized solution. The same equation holds for 1n − . After an infinite number of iterations in all the nodes, we will get the following steady-state equations:

( )1

( )

1 1

1

(0)1 1 ˆˆ ˆ( 1) ( 1)( )1 1

( )i

i

nk i

i ji jn k j N ji i

k j N ji i

n O nr k p

r k p

ττ τ−

−= ∈

= ∈

⎧ ⎫⎡ ⎤⎪ ⎪− = ⋅ + − +⎢ ⎥⎨ ⎬⎡ ⎤ ⎢ ⎥⎪ ⎪⎣ ⎦⎩ ⎭+⎢ ⎥⎢ ⎥⎣ ⎦

∑ ∑∑ ∑

As before, let us find an expression for ( 1)1 ˆ( 1)

i

nji

jij NO

r n−

∈ −∑ from the above equation:

( )

1( 1)

1

2( )

1

(0)1 1 1ˆ ˆ ( 1)( 1) ( )

1 1ˆ ˆ ˆ( 1) ( 1)( ) ( 1)

i i

i i

nn i

ji iji ji i ij N k j N

nk

ji j jji jik j N j N

O nr n r k p p

O n nr k r n

ττ

τ τ

−−

∈ = ∈

−

= ∈ ∈

⎡ ⎤⎡ ⎤⎢ ⎥⎢ ⎥= + ⋅ − − −

− ⎢ ⎥⎢ ⎥⎣ ⎦⎣ ⎦⎡ ⎤⎢ ⎥+ − − −

−⎢ ⎥⎣ ⎦

∑ ∑ ∑

∑ ∑ ∑

Now, replace the above expression in the equation of ( 1)ˆ ( )k

i nτ + :

( 1) ( )

1

1 1 ˆˆ ( ) {( )1 1

( )i

i

k ki jin j N ji

k j N ji i

n Or k

r k p

τ +

∈

= ∈

= ⋅⎡ ⎤

+⎢ ⎥⎢ ⎥⎣ ⎦

∑∑ ∑

( )2( )

1

(0)ˆ ( )n

k ij

k i

np

ττ−

=

⎡ ⎤+ +⎢ ⎥

⎢ ⎥⎣ ⎦∑ ( )( )

1( )

1

1 ˆ ( )( )

1 1 1 ˆˆ ( 1)( ) ( )

i

i i

kj

j N ji

nk

i jik j N j Nji i ji

nr k

n Or k p r k

τ

τ

∈

−

= ∈ ∈

+ +

⎡ ⎤⎛ ⎞+ + ⋅ − −⎢ ⎥⎜ ⎟⎜ ⎟⎢ ⎥⎝ ⎠⎣ ⎦

∑

∑ ∑ ∑ ( )2

1

ˆ ( 1)

(0)

n

jk

i

i

n

p

τ

τ

−

=

⎡ ⎤+ − −⎢ ⎥

⎢ ⎥⎣ ⎦

−

∑

( )( ) ( )1 1 ˆˆ ˆ( 1) ( ) }( ) ( )

i i

n kj ji j

j N j Nji ji

n O nr k r k

τ τ∈ ∈

− − + +∑ ∑

From this, we get:

( )

( )

1( 1) ( )

1

1

( ) ( )

1 1ˆ ˆ ˆ ˆ( ) { ( 1) ( 1) ( )( )1 1

( )

ˆ ( 1)1 ˆ ˆ ( ) }( )

i

i

i

nk k

i i j jn k j N ji

k j N ji i

n k iji j

j N ji i

n n n nr k

r k p

nO nr n p

τ τ τ τ

ττ

−+

= ∈

= ∈

∈

⎛ ⎞⎡ ⎤= ⋅ − − − + +⎜ ⎟⎣ ⎦⎜ ⎟⎡ ⎤ ⎝ ⎠+⎢ ⎥

⎢ ⎥⎣ ⎦−

+ + +

∑ ∑∑ ∑

∑

44

Now, let us continue to arrange the previous expression so that ( 1)ˆ ( )ki nτ + will be given by

the sum of ˆ ( 1)i nτ − and an additional term related to the last set of measurements in order to obtain a recursive relation.

( )

( )

( 1) ( ) ( )

1

1( )

1

1 1 ˆˆ ˆ ˆ ˆ( ) ( 1) { ( 1) ( )( )1 1

( )

1 ˆ ˆ( ) ( 1) }( )

i

i

i

k n ki i ji i jn j N ji

k j N ji i

nk

j jk j N ji

n n O n nr n

r k p

n nr k

τ τ τ τ

τ τ

+

∈

= ∈

−

= ∈

⎡ ⎤= − + − − − +⎣ ⎦⎡ ⎤+⎢ ⎥

⎢ ⎥⎣ ⎦⎡ ⎤

+ − −⎢ ⎥⎢ ⎥⎣ ⎦

∑∑ ∑

∑ ∑

( )

( )

( 1) ( )

1

( )

1

1 1 ˆˆ ˆ ˆ ˆ( ) ( 1) { ( 1) ( 1)( )1 1

( )

1 ˆ ˆ( ) ( 1) }( )

i

i

i

k ni i ji i jn j N ji

j N k ji i

nk

j jj N k ji

the estimated measurement

n n O n nr T

r k p

n nr k

τ τ τ τ

τ τ

+

∈

∈ =

∈ =

⎡ ⎤⎢ ⎥= − + ⋅ − − − − +⎢ ⎥⎡ ⎤ ⎢ ⎥+⎢ ⎥ ⎣ ⎦

⎢ ⎥⎣ ⎦⎡ ⎤

+ − −⎢ ⎥⎢ ⎥⎣ ⎦

∑∑ ∑

∑ ∑

For the special case where the matrix 1R− is similar for each set of measurements, we obtain:

( ) ( )( 1) ( ) ( )1 1 ˆˆ ˆ ˆ ˆ ˆ ˆ( ) ( 1) ( 1) ( 1) ( ) ( 1)1 1i

i

k n ki i ji i j j j

j N ji

j N ji ithe estimated measurement

n n O n n n n nrn

r p

τ τ τ τ τ τ+

∈

∈

⎧ ⎫⎡ ⎤⎪ ⎪⎢ ⎥= − + ⋅ − − − − + − −⎨ ⎬⎢ ⎥⎪ ⎪+ ⎢ ⎥⎣ ⎦⎩ ⎭

∑∑

The above algorithm is decentralized, recursive and iterative. For each set of measurements, it performs an infinite number of iterations in order to converge to the optimal solution. Moreover, ( 1)ˆ ( )k

i nτ + depends only on the last measurement and on the previous estimates. We point out that the last equation slightly deviates from the standard structure of a recursive algorithm due to the term n (time explicit index) in the denominator and in the internal term. Let us define the following quantity:

[ ] 1

1

1 1( )( )

i

n

ij N ki ji

I np r k

−

∈ =

⎡ ⎤= + ⎢ ⎥

⎢ ⎥⎣ ⎦∑ ∑

Hence, the following recursive relation holds:

[ ] [ ]

[ ] [ ]

1 1

1 1

1( ) ( 1)( )

1(0) (0)

i

i ij N ji

i ii

I n I nr n

I Pp

− −

∈

− −

⎧ = − +⎪⎪⎨⎪ = =⎪⎩

∑ (7.2.4)

45

We note that the elements of [ ] 1( )iI n − in (7.2.4) are the diagonal entries of the inverse covariance matrix in the Kalman Filter equations. Thus, the error estimation variances of the estimates at each step are obtained. This is a desirable property since it gives information on the estimation quality. Observe however that we do not compute the non-diagonal elements of the inverse covariance matrix. Using this notation, the recursive version of the decentralized algorithm in its final form is given by:

[ ]( ) ( ) ( )

[ ] [ ]

( 1) ( ) ( ) ( )1

1 1

1 1 1ˆˆ ˆ ˆ ˆ ˆ ˆ( ) ( 1) ( 1) ( ) 1 ( ) ( 1)( ) (7.2.5)

1 1 1( ) ( 1) 2,...,

i i

i i

k n k ki i ji i j j j

j N j Nji jii

i ij N j Nji i ji

n n O n n n n nr rI n

I n I n n i Nr p r

τ τ τ τ τ τ+−

∈ ∈

− −

∈ ∈

⎧ ⎫⎡ ⎤⎪ ⎪⎡ ⎤= − + ⋅ ⋅ − − − + − ⋅ ⋅ − −⎢ ⎥⎨ ⎬⎣ ⎦ ⎢ ⎥⎪ ⎪⎣ ⎦⎩ ⎭

= − + = + ⋅ =

∑ ∑

∑ ∑

where: [ ] [ ]1 1 1(0) (0)i ii

I Pp

− −= = .

We consider the case where the matrix 1R− is similar for each set of measurements for notation simplicity. Moreover, this assumption is quite logical and this scenario can be considered as the most representative case. The above set of equations in (7.2.5) correspond to our main algorithm that is summarized in Table 7.1. It is a decentralized, synchronous and recursive algorithm that computes at each step, the estimated offsets in addition to the corresponding error variances. The main advantage of this algorithm is its local nature; each network node needs to communicate only with its neighbors. We now describe in words the iterative procedure in (7.2.5). At time n , we assume that the estimate of ˆ ( 1)i nτ − is given. Then, ( )ˆ ( ) 1, 2,...k

i n kτ = is computed based on ˆ ( 1)i nτ − and the last measurement set ( )y n . We assume that a sufficient number of iterations is performed at each time n , so that the estimate ˆ ( )i nτ is accurate. The proposed algorithm in (7.2.5) may be derived in two ways, which lead to the same optimal equations: 1. Differentiate ( 1)J n − and ( )J n with respect to the offsets vector x and set the partial

derivatives to zero. 2. Algebraic manipulations of the standard recursive extension of (5.3.4), with the

following KF update inverse covariance equation:

( ) ( )1 1 11

Tk kP P AR A− − −+ = + (7.2.6)

In other words, the set of iterative equations in (7.2.5) is mathematically equivalent to perform (5.3.4) separately for each measurement set in addition to (7.2.6). This result will be useful in the proof of Theorem 7.1. This can be shown easily by some appropriate mathematical manipulations and is not presented. Now, let us show the convergence of the set of equations in (7.2.5) to the optimal centralized solution.

46

Theorem 7.1. Suppose that:

a) The matrix R is diagonal and Positive Semi-Definite, that is: ( ) 10 ,jir i j

−≤ < ∞ ∀ .

b) The initial covariance matrix 0P is an M-matrix, namely:

( )

( ) ( ) ( )

10

1 10 0

0

0 0

ijj

ii ij

P

P and P i j

−

− −

⎧ ≥⎪⎨⎪ ≥ ≤ ≠⎩

∑

c) The clock adjustment operation in (7.2.5) is applied synchronously by all nodes ( 2,3,...i N= ) in all iterations, recursively for n sets of measurements .

d) A sufficient number of iterations is performed after each measurement set n , so that ( )ˆ ( )k

i nτ converges to ˆ ( )i nτ . Then, for each 1n ≥ , the iterated estimators ( )ˆ ( ) 2,3,...k

i n i Nτ = converge (as k →∞ ) to the optimal offsets that minimize the objective function in (7.2.3). We next present the proof of Theorem 7.1. Our proof relies on Lemma 6.1. Lemma 6.1 immediately implies the convergence of the recursive extension (for several measurement sets) of equation (5.3.4) to the optimal solution, where at each step, the new covariance matrix is computed according to (7.2.6). Since the iterations in (7.2.5) are equivalent to the procedure in (5.3.4), we obtain the claimed convergence in Theorem 7.1.■ Remarks:

- We can also apply the recursive procedure to the case where at each stage, just some of the pair-wise offsets are measured (e.g., ( )1y is only the measurement of

34O ). In this case, the matrix TA for that stage will have a structure in accordance with the measurements.

- The main advantage of this recursive method is that we have the same diagonal

matrix 10P − during the whole procedure, thus it is easier to prove convergence.

Moreover, we do not need that each node communicates with all the other nodes but only with its neighbors. In fact, when a new set of measurement arrives, we exploit the estimate of the previous iteration and add only the information contained in the new measurement.

47

Name: Decentralized Synchronous Recursive Kalman Filter Assumptions: - The inverse initial covariance matrix 1

0P − is diagonal. - n sets of measurements are available. - The matrix 1R− is the same for each set of measurements.

Goal: Compute in a decentralized manner the offsets estimates at each network node (except for the reference) that approach the optimal centralized estimates.






8. Obtain the first set of relative measurements (1)ˆjiO and the associated inverse

covariances 1

jir for every ij N∈ . Compute [ ] 1 1 1(1)

i

ij N ji i

Ir p

−

∈

= +∑ .

9. Send (0)iτ to its neighbors ij N∈ . Obtain (0)j ij Nτ ∈ and keep in memory (0)iτ . At every time n that a new set of measurement arrives:

10. Compute recursively [ ] [ ]1 1 1( ) ( 1)i

i ij N ji

I n I nr

− −

∈

= − + ∑ and send n to every node.

11. At every iteration k , each node iΛ performs: a. Send ( )ˆ ( )k

i nτ and k to its neighbors ij N∈ . Obtain ( )ˆ ( )kj in j Nτ ∈ .

b. Compute ( 1)ˆ ( )ki nτ + from the previous quantities, using (7.2.5).

12. Send the final values of ˆ ( )i nτ to its neighbors ij N∈ . Obtain ˆ ( )j in j Nτ ∈ and keep in memory ˆ ( )i nτ .

Table 7.1. Summary of the Decentralized Synchronous Recursive KF Algorithm. So far, we developed the recursive decentralized algorithm for multiple sets of measurements using the Least-Squares approach. Next, we will show that one can obtain the same algorithm through the Kalman Filter equations.

48

7.3 Equivalence with the KF Equations We will show that the previous recursive algorithm can be obtained by applying the Kalman Filter equations through appropriate manipulations. By applying the information form of the Kalman Filter in our context, we obtained the following recursive equation (see in Section 5.1):

( )11 1 1ˆ ˆ ˆ( | ) ( 1| 1) ( 1| 1) ( ) ( 1| 1)T Tx n n x n n P n n Q AR A AR y n A x n n−− − −⎡ ⎤ ⎡ ⎤= − − + − − + + − − −⎣ ⎦⎣ ⎦

Now, we will develop a decentralized version of the above equation and as expected, we will show that this approach leads to the same recursive algorithm that we obtained in the last section. Let us recall that ( )1 2( ) 0, ,... T

Nx n τ τ τ= (where iτ is the offset of the node iΛ ) and that

[ ] 1 1(0)ii

Pp

− = (the initial inverse covariance matrix is diagonal). Moreover, we assume as

usual that the matrix R is diagonal. For the case where 0Q = (there is no process noise), due to the special structure of the matrices 1 TAR A− and 1AR− , one can write the previous equation for each node separately and obtain the following decentralized estimates:

( )( )1 1 1ˆˆ ˆ ˆ ˆ ˆ( ) ( 1) ( 1) ( ) ( 1) 2,...,1 1i i

i

ni i ji i j j

j N j Nji ji

j Ni ji

n n O n n n n i Nr rn

p r

τ τ τ τ τ∈ ∈

∈

⎧ ⎫⎡ ⎤⎪ ⎪⎡ ⎤= − + ⋅ ⋅ − − + ⋅ ⋅ − − =⎢ ⎥⎨ ⎬⎣ ⎦ ⎢ ⎥⎪ ⎪⎣ ⎦⎩ ⎭+ ⋅∑ ∑

∑

As before, we can define:

[ ] [ ]

[ ] [ ]

1 1

1 1

1 1 1( ) ( 1)

1(0) (0)

i i

i ij N j Nji i ji

i ii

I n I n nr p r

I Pp

− −

∈ ∈

− −

⎧ = − + = + ⋅⎪⎪⎨⎪ = =⎪⎩

∑ ∑

and then the recursive algorithm is given by:

[ ]( ) ( ) ( )

[ ] [ ]

( 1) ( ) ( ) ( )1

1 1

1 1 1ˆˆ ˆ ˆ ˆ ˆ ˆ( ) ( 1) ( 1) ( ) 1 ( ) ( 1)( )

1 1 1( ) ( 1) 2,...,

i i

i i

k n k ki i ji i j j j

j N j Nji jii

i ij N j Nji i ji

n n O n n n n nr rI n

I n I n n i Nr p r

τ τ τ τ τ τ+−

∈ ∈

− −

∈ ∈

⎧ ⎫⎡ ⎤⎪ ⎪⎡ ⎤= − + ⋅ ⋅ − − − + − ⋅ ⋅ − −⎢ ⎥⎨ ⎬⎣ ⎦ ⎢ ⎥⎪ ⎪⎣ ⎦⎩ ⎭

= − + = + ⋅ =

∑ ∑

∑ ∑

where: [ ] [ ]1 1 1(0) (0)i ii

I Pp

− −= = .

This is an iterative (synchronous) algorithm that computes recursively the estimated offset for each node iΛ ( 0k ≥ is the iteration number) in a decentralized manner. The above algorithm is exactly similar to the procedure that we obtained in (7.2.5). In summary, we have shown as expected that the decentralized recursive algorithm we developed is exactly equivalent to the Kalman Filter. Indeed, we solve the problem of

49

finding an optimal decentralized algorithm by two different approaches that lead to the same optimal solution. This is not a surprise because in the different derivations we are looking for the linear optimal estimate in the MMSE sense and the optimal solution is given by the Kalman Filter. The special structure of the centralized version of the Kalman Filter equations enables us to derive a decentralized version just with some mathematical manipulations. This is not always possible, as we will see for the case where a white noise process is incorporated in the equations. The problem of the Kalman Filter equations was the fact that the covariance matrix does not keep its diagonal initial structure. Hence, each node has to communicate with the entire network (or equivalently to make use of a central unit). In the decentralized version, the requirement is that each node communicates only with its one-hop neighbors. In addition, the terms [ ] 1( )iI n − are equal to the diagonal terms of the inverse covariance matrix. In other words, this algorithm computes the variance of each estimate, so we can know the quality of the estimation at each node. Next, we propose a simple sub-optimal algorithm that computes the offsets for the case where multiple sets of measurements are available. 7.4 A Sub-Optimal Decentralized Algorithm Another approach to solving the decentralized estimation problem can be considered. At the end of the section 5, we obtained a general decentralized equation in the case of a non-diagonal initial covariance matrix (for a single set of measurements):

( )( ) ( ) ( ) ( )( 1) ( ) 1 1 ( )

0 011

0

1 1 ˆˆ ˆ ˆ(0) (0) (7.4.1)1 i

i

Nk k k

i ji j i m mii imj N mji

m iii

j N ji

O P Pr

Pr

τ τ τ τ τ+ − −

∈ =−≠

∈

⎡ ⎤⎢ ⎥= + + − −⎢ ⎥⎛ ⎞⎢ ⎥+ ⎣ ⎦⎜ ⎟⎜ ⎟

⎝ ⎠

∑ ∑∑

As expected, the estimated offset of node iΛ depends on all the other offsets and not only those of its neighbors. One can think about the naïve sub-optimal algorithm that neglects the off-diagonal terms of the inverse covariance matrix:

( )( ) ( )( 1) ( ) 1

01

0

1 1 ˆˆ ˆ (0)1 i

i

k ki ji j iii

j N ji

iij N ji

O Pr

Pr

τ τ τ+ −

∈−

∈

⎡ ⎤= + +⎢ ⎥⎛ ⎞ ⎢ ⎥⎣ ⎦+⎜ ⎟⎜ ⎟⎝ ⎠

∑∑

We obtained a decentralized sub-optimal algorithm for the case where a single set of measurements is available. Let us generalize for the multiple measurement scenario. In this case, the centralized optimal algorithm is given by:

11 1 1ˆ ˆ ˆ( | ) ( 1| 1) ( 1| 1) ( ) ( 1| 1)T Tx n n x n n P n n AR A AR y n A x n n−− − −⎡ ⎤ ⎡ ⎤= − − + − − + − − −⎣ ⎦ ⎣ ⎦

If we neglect the off-diagonal terms of the inverse covariance matrix at time 1n − , we have (as usual, we assumed that the initial covariance matrix is diagonal):

( ) 11 1 1ˆ ˆ ˆ( | ) ( 1| 1) ( 1| 1) ( ) ( 1| 1)T Tx n n x n n diag P n n AR A AR y n A x n n−

− − −⎡ ⎤ ⎡ ⎤= − − + − − + − − −⎣ ⎦⎣ ⎦

( ) ( ) ( ) ( ) [ ] 11 10

1 1 1( 1| 1) 1 1 ( 1)i i

iiij N j Nji i ji

diag P n n P n n I nr p r

−− −

∈ ∈

− − = + − ⋅ = + − ⋅ = −∑ ∑

50

Here, we neglect the off-diagonal terms before inverting the information matrix ( ) 11nP −− , in

the goal to improve the complexity. In this case, we will invert a diagonal matrix and hence the time computation will significantly decrease. Therefore, the decentralized sub-optimal algorithm is given by:

( )( )( 1) ( ) ( )

10

1 1 ˆˆ ˆ ˆ ˆ( ) ( 1) ( 1) ( ) (7.4.2)1i

i

k n ki i ji i j

j N jiii

j N ji

n n O n nrP n

r

τ τ τ τ+

− ∈

∈

⎧ ⎫⎪ ⎪⎡ ⎤= − + − − −⎨ ⎬⎣ ⎦⎪ ⎪⎩ ⎭+ ⋅∑

∑

We can interpret equation (7.4.2) in a very logical manner. The new estimate is given by the sum of the previous estimate and a correction term. This correction term is composed of the latest measurement minus the estimated measurement multiplied by the measurement variance and the total is normalized by the accumulative variance. In the numerical results section, we will compare the decentralized recursive algorithm that converges to the optimal solution to the above sub-optimal scheme. The sub-optimal algorithm is summarized in Table 7.2.

51

Name: Decentralized (recursive) Sub-Optimal Algorithm Assumptions: - The inverse initial covariance matrix 1

0P − is diagonal. - n sets of measurements are available. - The matrix 1R− is the same for each set of measurements.

Goal: Compute in a decentralized manner the offsets estimates at each network node (except for the reference) in a logical sub-optimal manner.






3. Obtain the first set of relative measurements (1)ˆjiO and the associated inverse

covariances 1

jir for every ij N∈ . Compute [ ] 1 1 1(1)

i

ij N ji i

Ir p

−

∈

= +∑ .

4. Send (0)iτ to its neighbors ij N∈ . Obtain (0)j ij Nτ ∈ and keep in memory (0)iτ . At every time n that a new set of measurement arrives:

5. Compute recursively [ ] [ ]1 1 1( ) ( 1)i

i ij N ji

I n I nr

− −

∈

= − + ∑ .

6. At every iteration k , each node iΛ performs: a. Send ( )ˆ ( )k

i nτ and k to its neighbors ij N∈ . Obtain ( )ˆ ( )kj in j Nτ ∈ .

b. Compute ( 1)ˆ ( )ki nτ + from the previous quantities, using (7.4.2).

7. Send the final values of ˆ ( )i nτ to its neighbors ij N∈ . Obtain ˆ ( )j in j Nτ ∈ and keep in memory ˆ ( )i nτ .

Table 7.2. Summary of the Decentralized Sub-Optimal Algorithm. Next, we will present a decentralized non-recursive algorithm that solves the time synchronization problem in the multiple measurement update case.

52

7.5 Decentralized Non-Recursive Algorithm Let us present an alternative method for the estimation algorithm in the multiple measurement case. The objective function is the same as before with the usual assumptions. Namely, the matrices 0P and R are assumed to be diagonal and we suppose that the matrix

1R− is similar for all the measurement sets.

1 10 0 0

1( ) ( ) ( ( ) ) ( ( ) )

nT T T T

kJ x x P x x y k A x R y k A x− −

=

= − − + − −∑

In Section 5, we have obtained the following synchronous decentralized iterative clock synchronization algorithm (for a single set of measurements):

( )( 1) (1) ( ) (0)1 1 ˆˆ ˆ 2,3,...1 1 i

i

k k ii ji j

j N ji i

j N ji i

O i Nr p

r p

ττ τ+

∈

∈

⎡ ⎤= ⋅ + + =⎢ ⎥⎛ ⎞ ⎢ ⎥⎣ ⎦+⎜ ⎟⎜ ⎟⎝ ⎠

∑∑

(7.5.1)

Initialization: (0)ˆ (0) 2,3,...i i i Nτ τ= = . Here, 0k ≥ is the iteration number. Our goal is to develop a new non-recursive estimation method (batch algorithm) for the multiple measurement case. In the latter, we will define the equivalent measurement and the corresponding equivalent covariance. In fact, we wait for all the measurement sets and then, we compute the equivalent measurement and its corresponding covariance as follows (we recall that the different measurement sets have independent noises):

( )

1 ( )

1

1

1

1 ˆ( ) 1ˆ ˆ

1( )

1 1 1( )

nkji n

k ji kji jin

k

k ji

n

k ji jiji

Or k

O On

r k

nr k rr

=

=

=

=

⎧⎪⎪ = =⎪⎪⎨⎪⎪

= =⎪⎪⎩

∑∑

∑

∑

Now, we can consider that we have just a single set of measurements and we can use the decentralized clock synchronization procedure in (7.5.1).

Let us replace the term: (1)1 ˆi

jij N ji

Or∈

∑ by the above equivalent expressions and then we obtain:

( 1) ( )

( 1) ( ) ( )

1

(0)1 1 1ˆˆ ˆ1 1

(0)1 1 ˆˆ ˆ 2,3,... (7.5.2)1 1

i i

i

i

i

k k ii ji j

j N j N iji ji

j N iji

nk k k i

i ji jj N kji i

j N ji i

Opr r

pr

O n i Nr p

nr p

ττ τ

ττ τ

+

∈ ∈

∈

+

∈ =

∈

⎡ ⎤= ⋅ + +⎢ ⎥⎛ ⎞ ⎢ ⎥⎣ ⎦⎜ ⎟+⎜ ⎟⎝ ⎠

⎡ ⎤⎛ ⎞= ⋅ + ⋅ + =⎢ ⎥⎜ ⎟⎛ ⎞ ⎝ ⎠⎢ ⎥⎣ ⎦+⎜ ⎟⎜ ⎟⎝ ⎠

∑ ∑∑

∑ ∑∑

53

The equation given in (7.5.2) is an additional decentralized algorithm for estimating the offset at each network node with respect to the reference time, using a Kalman Filter framework. In fact, instead of using the first measurement in the synchronization procedure, we employ the sum of all the measurements normalized by the number of sets. One can also extend the previous procedure to the case in which the matrix 1R− is different for each set of measurements. Actually, the equivalent measurement is given by the weighting average of all the measurements pre-multiplied by their variances. In summary, we have developed several algorithms in order to compute the optimal estimated offsets in a network, including in the multiple measurement case. The first option is to apply the centralized Kalman Filter equations but in this case, each node is required to communicate with every other node and not only with its neighbors. The second option is to wait for all the sets of measurements and to apply the non-recursive algorithm we have presented in the last section. In this case, the estimates are optimal and the communication is only between neighbors but it cannot be implemented in real-time applications. The third possibility is the algorithm that we developed; it requires only local communication and gives an on-line optimal solution but we need to know the parameter n (the number of measurement sets) and to run an iterative procedure. The fourth and last option is to apply the sub-optimal algorithm (by neglecting the off-diagonal terms of the inverse covariance matrix). This solution is the simplest in the sense that it does not require an iterative algorithm and require local communication however, the solution is only sub-optimal. In the next section, we will extend our results to several interesting directions: the incorporation of a discount factor (in order to compensate for the time-invariance assumption), the addition of a process noise to the state space equations and the extension to temporary communication failures environment.

54

8. Extensions 8.1 Incorporating a Discount Factor In this case, the objective function to be minimized is given by ( n represents the number of measurement sets):

min1 10 0 0

1

ˆ( ) ( ) ( ( ) ) ( ( ) ) ( ) (8.1.1)n

n T n k T T Tn opt

k

J x x P x x y k A x R y k A x x nγ γ− − −

=

⎡ ⎤= − − + − − ⎯⎯→⎣ ⎦∑

Here, 0 1γ< < is the discount factor that gives a higher weight to the more recent measurements. In other words, this factor can compensate for the assumption that the offsets are time-invariant. An additional point of view consists of incorporating the discount factor to the state space model or to the measurement noise covariance matrix. The corresponding state space model in this case is given by:

( )2

( 1) ( )( ) 0,

( ) ( ) ( )n

T

x n x nv n N R

y n A x n v nγ

+ =⎧⎪⎨⎪ = +⎩

∼

The second alternative is to consider the standard state space model and to incorporate the discount factor in the measurement noise covariance matrix. In this case:

( )( ) 0, n

nn

v n N R

R R γ= ⋅

∼

In other words, the measurement noise decreases in time. Our goal is to find the optimal offsets. Hence, we compute the derivative of the objective function with respect to the offsets vector and set it to zero. As usual, we assume that the initial inverse covariance matrix 1

0P − is diagonal.

( ) ( ) ( ) ( )

( )

1 1 1 10

1

( )

1

( )

1

( ) 0 2,3,...

1 1 1 1ˆ (0) 0

1 1 1 ˆ

i i

i i

nn k T n

i i i iki

nn k k n

i j ji i ik j N j Nji ji i i

nn k n n k k

i jik j N j Nji i ji

J AR A x AR y k P x P x i Nx

Or r p p

Or p r

γ γ

γ τ τ γ τ τ

τ γ γ γ τ

− − − − −

=

−

= ∈ ∈

− −

= ∈ ∈

∂ ⎡ ⎤ ⎡ ⎤= − + − = =⎣ ⎦ ⎣ ⎦∂

⎡ ⎤ ⎡ ⎤− − + − =⎢ ⎥ ⎢ ⎥

⎢ ⎥ ⎣ ⎦⎣ ⎦⎡ ⎤

⋅ + ⋅ = +⎢ ⎥⎢ ⎥⎣ ⎦

∑

∑ ∑ ∑

∑ ∑ ∑ ( )1

1 (0) 0n

nj i

k ipγ τ

=

⎡ ⎤+ ⋅ =⎢ ⎥

⎢ ⎥⎣ ⎦∑

This implies:

( )( )

1

1

1 1 1ˆ( ) ( ) (0) 2,3,... (8.1.2)1 1 i

i

nn k k n

i ji j in k j N ji in k n

j N kji i

n O n i Nr p

r p

τ γ τ γ τγ γ

−

= ∈−

∈ =

⎧ ⎫⎡ ⎤⎪ ⎪= + + ⋅ =⎢ ⎥⎨ ⎬⎛ ⎞ ⎢ ⎥⎪ ⎪⎣ ⎦⎩ ⎭⋅ + ⋅⎜ ⎟⎜ ⎟⎝ ⎠

∑ ∑∑ ∑

55

Equation (8.1.2) represents the optimal offset of node iΛ given n sets of measurements. We notice that the case 1γ = coincides with the case we studied in the previous section (without discount factor). The above equation can be implemented by a synchronous iterative algorithm, as in the previous cases:

( )( 1) ( ) ( )

1

1

1 1 1ˆˆ ˆ( ) ( ) (0) (8.1.3)1 1 i

i

nk n m m k n

i ji j in m j N ji in m n

j N mji i

n O nr p

r p

τ γ τ γ τγ γ

+ −

= ∈−

∈ =

⎧ ⎫⎡ ⎤⎪ ⎪= + + ⋅⎢ ⎥⎨ ⎬⎛ ⎞ ⎢ ⎥⎪ ⎪⎣ ⎦⎩ ⎭⋅ + ⋅⎜ ⎟⎜ ⎟⎝ ⎠

∑ ∑∑ ∑

Next, we will develop a recursive version of this synchronous iterative algorithm in a similar way to the basic case. The synchronous iterative algorithm that computes the optimal offset at node iΛ given

1n − sets of measurements is given by:

( )1

( 1) 1 ( ) ( ) 1

1 11 1

1

1 1 1ˆˆ ˆ( 1) ( 1) (0)1 1 i

i

nk n m m k n


j N mji i

n O nr p

r p

τ γ τ γ τγ γ

−+ − − −

−= ∈− − −

∈ =

⎧ ⎫⎡ ⎤⎪ ⎪− = + − + ⋅⎢ ⎥⎨ ⎬⎛ ⎞ ⎢ ⎥⎪ ⎪⎣ ⎦⎩ ⎭⋅ + ⋅⎜ ⎟⎜ ⎟⎝ ⎠

∑ ∑∑ ∑

After an infinite number of iterations in all the nodes, we will obtain the following steady-state equations:

( )1

1 ( ) 1

1 11 1

1

1 1 1ˆˆ ( 1) ( 1) (0)1 1 i

i

nn m m n


j N mji i

n O nr p

r p

τ γ τ γ τγ γ

−− − −

−= ∈− − −

∈ =

⎧ ⎫⎡ ⎤⎪ ⎪− = + − + ⋅⎢ ⎥⎨ ⎬⎛ ⎞ ⎢ ⎥⎪ ⎪⎣ ⎦⎩ ⎭⋅ + ⋅⎜ ⎟⎜ ⎟⎝ ⎠

∑ ∑∑ ∑

Let us find the expression for ( 1)1 ˆ

i

nji

jij NO

r−

∈∑ from the above equation:

( )

1( 1) 1 1 1

1

21 ( )

1

(0)1 1 1ˆ ˆ ( 1)

1 1ˆ ˆ ˆ( 1) ( 1)

i i

i i

nn n m n n i

ji iji ji i ij N j N m

nn m m

ji j jji jim j N j N

O nr r p p

O n nr r

τγ γ τ γ

γ τ τ

−− − − − −

∈ ∈ =

−− −

= ∈ ∈

⎡ ⎤⎢ ⎥= ⋅ + ⋅ ⋅ − − ⋅ −⎢ ⎥⎣ ⎦

⎡ ⎤⎢ ⎥− + − − −⎢ ⎥⎣ ⎦

∑ ∑ ∑

∑ ∑ ∑

56

Now, let us replace the last expression in the equation of ( 1)ˆ ( )ki nτ + and after some simple

algebraic operations, we will obtain the following decentralized recursive algorithm (the steps are very similar to the previous case, hence are not detailed here):

[ ]( ) ( )

[ ] [ ]

1( 1) ( ) ( ) ( )

11

1 1

1 1 1ˆˆ ˆ ˆ ˆ ˆ ˆ( ) ( 1) ( 1) ( ) ( ) ( 1)( ) (8.1.4)

1( ) ( 1) 2,3,...

i i

i

nk n k n m k

i i ji i j j jj N m j Nji jii

i ij N ji

n n O n n n nr rI n

I n I n i Nr

τ τ τ τ γ τ τ

γ

−+ −

−∈ = ∈

− −

∈

⎧ ⎫⎡ ⎤⎛ ⎞⎪ ⎪⎡ ⎤= − + ⋅ ⋅ − − − + ⋅ ⋅ − −⎢ ⎥⎨ ⎬⎜ ⎟⎣ ⎦ ⎝ ⎠ ⎢ ⎥⎪ ⎪⎣ ⎦⎩ ⎭

= ⋅ − + =

∑ ∑ ∑

∑

where: [ ] [ ]1 1 1(0) (0)i ii

I Pp

− −= = .

As expected, the algorithm in (8.1.4) reduces to the previous one when 1γ = (without discount factor). The above algorithm is optimal (in the MMSE sense) given the previous model measurements, since we proved that it is equivalent to the Kalman Filter algorithm. Moreover, this algorithm is easy to implement as it requires communication only with neighbors, allowing us to implement it locally. In the equation of [ ] 1( )iI n − , the interpretation of the discount factor is clear. Since [ ] 1( )iI n − is a measure of the inverse covariance matrix of the error estimate, it is logical that the new information depends on the previous information multiplied by γ (like a forgetting factor) and on the inverse covariance of the measurements. We note that it is straightforward to obtain the above algorithm in the case where the matrix 1R− is different for each set of measurements.

57

8.2 Addition of Process White Noise Now, we will include a white noise ( )w n to the clocks readings. In this case, the state space model will be given by:

( 1) ( ) ( )( ) ( ) ( )T


+ = +⎧⎨

= +⎩ (8.2.1)

We can interpret the noise ( )w n as a measure of the unknown difference between two successive offsets or as a compensation of the uniform skew assumption and the time-invariant offsets. This is different from the incorporation of a discount factor because it gives more flexibility through the choice of the noise statistics parameters. We assume that ( )w n is modeled as a white Gaussian noise with zero mean and covariance

( ) 0Q n ≥ . Namely:

[ ] [ ]( ) 0 ( ), ( ) ( ) klE w n Cov w k w l Q k δ= = ⋅ Moreover, we assume that the process noise and the measurement noise are statistically independent:

[ ]( ), ( ) 0 ,Cov w k v l k l= ∀

In this case, the optimal estimate obtained by applying the Kalman Filter is equivalent to the constrained minimization of the deterministic objective function (see the proof in Appendix A):

( ) 1 10 0 0

min1

0

(0), (1),..., ( ) ( (0) ) ( (0) )

ˆ( ( ) ( )) ( ( ) ( ))

: ( 1) ( ) ( )

T T

nT T T

optk

J x x x n x x P x x w Q w

y k A x k R y k A x k x

subject to x n x n w n

− −

−

=

⎫= − − + +⎪⎪+ − − ⎯⎯→⎬⎪⎪+ = + ⎭

∑

We note that this is the same objective function as before, with the addition of the last term. By using the constraint, we can write this last term in the following way:

( ) ( )1 1 1

1 1

( ) ( 1) ( ) ( 1)n n

TTk k k k

k k

w Q w w Q w x k x k Q x k x k− − −

= =

= = − − − −∑ ∑ As usual, we will assume that the initial inverse covariance matrix 1

0P − is diagonal. In addition, we will assume that the covariance matrix of the process noise is diagonal and time invariant, i.e., ( ) 1,...,Q k Q k n= ∀ = . In this new case, the offsets are time varying and the optimal solution is time dependent:

0|

|

ˆ ˆ(0)

. .

. .

. .ˆ ˆ( )

opt n

opt n n

x x

x n x

⎛ ⎞ ⎛ ⎞⎜ ⎟ ⎜ ⎟⎜ ⎟ ⎜ ⎟⎜ ⎟ ⎜ ⎟=⎜ ⎟ ⎜ ⎟⎜ ⎟ ⎜ ⎟⎜ ⎟ ⎜ ⎟⎝ ⎠ ⎝ ⎠

58

The preceding method cannot be applied to obtain the optimal solution in this case; the time dependency makes the estimation problem more difficult. With the purpose of finding these optimal time varying offsets, we may compute the derivatives of the objective function with respect to the offsets vectors at time 0,1,...,n and to set them to zero:

( ) ( )

( ) ( )

1 1 1 1 1 10

1 1 1

(0) (0) (0) (0) (1) 0(0)

.

.

.

( ) ( ) ( )( )

T

T

J P x P x AR y AR A x Q x Q xx

J AR A x k AR y k Q x kx k

− − − − − −

− − −

∂= − − + + − =

∂

∂= − +

∂1 1( 1) ( )Q x k Q x k− −− − −

( ) ( )

1

1 1 1 1

( 1) 0; 1,..., 1

( ) ( ) ( ) ( 1) 0( )

T

Q x k k n

J AR A x n AR y n Q x n Q x nx n

−

− − − −

⎧⎪⎪⎪⎪⎪⎪⎨⎪⎪ + + = ∀ = −⎪⎪ ∂⎪ = − + − − =∂⎪⎩

One can observe that the solution is not simple since each vector depends on the estimates at different times. Indeed, we see that the estimate at time 0 depends on that at time 1, the estimate at time k ( 1,..., 1k n∀ = − ) depends on its adjacent estimates (at times k+1 and k-1 ), and finally the one at time n depends on that at time n-1 . Our goal is to develop a recursive relation of the following form:

( )1| 1 |ˆ ˆ , ( 1)n n n nx f x y n+ + = +

The method described above is not efficient to solve this problem. Hence, we choose to write the Kalman Filter equations. In the previous notation, we have: , TI H AΦ = = . The KF equations are therefore:

Time update (prediction): ˆ ˆ( 1| ) ( | )

( 1| ) ( | )x n n x n nP n n P n n Q

+ =⎧⎨ + = +⎩

Measurement update:

1 1

ˆ ˆ ˆ( 1| 1) ( 1| ) ( 1) ( 1) ( 1| )

( 1) ( 1| ) ( 1| ) ( 1| 1)

( 1| 1) ( 1) ( 1| )

T

T

T

x n n x n n K n y n A x n n

K n P n n A A P n n A R P n n AR

P n n I K n A P n n

− −

⎧ ⎡ ⎤+ + = + + + + − +⎣ ⎦⎪⎪ ⎡ ⎤+ = + + + = + +⎨ ⎣ ⎦⎪

⎡ ⎤+ + = − + +⎪ ⎣ ⎦⎩

59

The recursive combined Kalman Filter equations are given by:

( )

1

11 1 1 1

ˆ ˆ ˆ( 1| 1) ( | ) ( 1| 1) ( 1) ( | )

( 1| 1) ( 1| ) ( | )

T

T T

x n n x n n P n n AR y n A x n n

P n n P n n AR A P n n Q AR A

−

−− − − −

⎧ ⎡ ⎤+ + = + + + + −⎪ ⎣ ⎦⎨

+ + = + + = + +⎪⎩ (8.2.2)

We can see that if 0Q = (there is no process noise), we achieve the same result as in the basic case. As we previously mentioned, the inverse covariance matrix is not diagonal after one step (due to the addition of the term 1 TAR A− ). Combining these two equations leads to the following recursive centralized algorithm:

( )11 1 1ˆ ˆ ˆ( 1| 1) ( | ) ( | ) ( 1) ( | ) (8.2.3)T Tx n n x n n P n n Q AR A AR y n A x n n−− − −⎡ ⎤ ⎡ ⎤+ + = + + + + −⎣ ⎦⎣ ⎦

The only difference with the previous case is the presence of the covariance matrix Q . In the case without process noise, we succeeded in developing a decentralized version of this recursive centralized algorithm. However, when a process noise is incorporated to the state space model, the new structure of the inverse covariance matrix 1( 1| 1)P n n− + + does not enable the application of the same procedure. Moreover, a decentralized non-recursive (batch) algorithm is not an option, as there is a correlation between the different measurements, so we cannot compute the equivalent measurement easily. The single solution we propose is to apply the recursive centralized Kalman Filter algorithm given in (8.2.3). It will lead to an optimal on-line solution, but requires communication between all the nodes over the network (or the existence of a central unit). In summary, in the case of the presence in the system of a process white noise, we have not developed a recursive decentralized algorithm to optimally estimate the offsets at each network node. The approaches that we employed in the previous case are not applicable here, and this problem is still unsolved. On the other hand, the addition of a process noise in our model is not compulsory since the offsets and the skew can be assumed to be constant over some known time intervals. Moreover, this assumption can be compensated through the incorporation of a discount factor as explained in the previous section.

60

8.3 Faulty Communication Environment One desirable attribute of any decentralized algorithm is robustness to communication failures, such failures being unavoidable in practice. So far, we considered several algorithms that iteratively compute the optimal estimates, assuming perfect communication channels (no failures). In this section, we improve the algorithms to handle with dynamic changes in the communication topology by considering temporary link failures. We slightly modify our algorithm to become robust to temporary communication failures. Since a neighbor may become unavailable at any time, every node stores in its local memory the estimates of its neighbors' variables recorded from the last successful communication exchange. We denote by ( ) ( )ˆ k

i jτ the estimate of iΛ 's clock offset kept in jΛ 's local

memory at the end of the k -th iteration. If the last successful communication between iΛ

and jΛ took place during the l -th iteration, while l k< , then ( ) ( ) ( )ˆ ˆk li ijτ τ= .

Let ( )ki iN N⊆ be the subset of iΛ 's neighbors that send and receive data successfully

during the k -th iteration. In other words, node iΛ gets from every ( )kj iNΛ ∈ the most

recent estimates and updates its copy of its neighbors' estimates. For the rest of the neighbors, the communication fails, so the local copies remain unchanged. In mathematical notation:

( )( ) ( )

( )

( 1) ( )

ˆ ;ˆ

ˆ ; \

k ki i jk

i j k ki i j j

N

N N

ττ

τ −

⎧ ∀Λ ∈⎪= ⎨∀Λ ∈⎪⎩

(8.3.1)

Then, node iΛ computes its own estimate at the 1k + iteration by one of the algorithms developed in the previous sections. The only difference is that the estimates of the neighbors are taken according to the above formula and depend on the failures in the communication links. This extension was inspired by [3], where the authors apply this method for solving the time synchronization problem in a faulty communication environment for the WLS case ( 1

0 0P − = ). They also show its convergence to the optimal estimates when certain mild conditions on the failure rate are satisfied. Namely, they consider that there exists a positive integer p < ∞ such that the number of consecutive communication failures between every pair of neighboring nodes is less than p . One can apparently extend the result for the general Kalman Filter framework, but a convergence analysis must be investigated. In this research, we did not consider the latter analysis. Moreover, in [3] the authors propose an initialization scheme that improves the accuracy of the estimates. The corresponding details are omitted here. In summary, we propose a modification to our algorithm so it may work even in the presence of faulty communication. In fact, the WLS algorithm is proved (in [3]) to converge to the optimal solution even in the presence of link failures, whereas for the decentralized KF, such a proof was not investigated here and can be considered as a future research direction.

61

9. Clock Skew Estimation In the previous analysis, we considered a simple model where all the clocks progress at the same rate ( 1i jα α= = , i.e., there is no skew), but have arbitrary offsets. In other words, the estimation problem was reduced to the estimation of the clock offsets. In this section, we extend the results to the case of general clock skew, when the clock offset parameters are still assumed to be time-invariant and our objective is to estimate both the clock offsets and the rate offsets at each network node. Naturally, the importance of estimating the clock skew increases as the measurements (and estimates) are conducted over longer time intervals. If all the measurements are obtained simultaneously, the skew parameter is irrelevant. We still suppose that the clock drift at a node follows the linear form:

( )i i iT t tα τ= + , where iα and iτ are the skew (rate offset) and the offset parameters respectively, t is the real time (or the reference time) and iT is the local time (at node iΛ ). Our goal is to estimate the parameters iα that describe the rate of the local clocks relative to the reference clock ( 1 1α = ) in addition to the offsets iτ . If one knew the constants iα , then one could estimate iτ as in the previous sections by first dividing all the local clocks readings by iα . Thus, we must now describe how to obtain estimates of these skew values

iα , and do so without prior knowledge of the offsets iτ or to propose an algorithm that estimates both parameters simultaneously. In this section, we will divide the analysis into two different approaches. The first approach estimates the time offsets and the rate offsets simultaneously, whereas the second is based on a separate estimation of both parameters. The latter treats the clock offset and the clock skew on different time scales like the scheme in [21]. First, we will describe a procedure to estimate simultaneously both the offsets and the skews using a combined Kalman Filter algorithm. We will consider a centralized optimal algorithm and describe briefly its decentralized implementation. Second, we treat the case of separate skew and offset estimation problem. We propose three different algorithms in order to estimate the skew parameter at each node over the network with respect to the reference clock. The procedure of each method will be detailed, before comparing their characteristics with respect to one another. The first method was introduced in the literature (see [41] and [21]) and is characterized by the application of the logarithmic function on the skew parameters. The second method is mathematically equivalent to the first one, but does not require the application of the logarithm. The latter is named the multiplicative method due to its multiplicative nature. Both these methods are based on Least-Squares minimization of appropriate cost functions. The third method uses the same measurement model as the offset estimation problem and in opposition to the two other methods, is only sub-optimal. The second and the third method are to the best of our knowledge original. We note that the exposition in this chapter is relatively brief, and we avoid repeating details that are similar to previous chapters.

62

9.1 Combined Skew and Offset Estimation Our purpose is to develop a unified algorithm in order to estimate the clock parameters (both clock offset and skew) simultaneously. First, we write the state space model of our problem that includes the clock skew influence. Then, we will optimally solve the estimation problem by applying the Kalman Filter algorithm to the state space vector that contains all the clock parameters (both clock offset and skew) at each network node. In this part, we consider the same measurement model as in the offset estimation problem (see Figure 4.1). In our clock model, the local time (at node iΛ ) is given by: ( )i i iT t tα τ= + , where iα and

iτ are the skew (rate deviation) and the clock offset parameters respectively and t is the real time (or the reference time). Let us define the state vector ( )x t with its i-th element given by:

( )( ) ( ) 1i i i ix t T t t tα τ= − = − + Let us now perform a uniform discretization of the previous continuous-time equation (for simplicity, we assumed that the sampling interval is uniform and denoted by ST ):

( ) ( )( ) 1 (0) 1i i S i i S ix n T n x T nα τ α= − + = + ⋅ − (9.1.1) We define: ( )1i ib α − as a constant random bias at node iΛ and ( )1 20, ,..., T

Nb b b b= = . In other words, the bias b is equivalent to the skew parameter minus 1. In vector form, we obtain:

( ) (0) Sx n x nT b= + ⋅

or, equivalently: ( ) ( 1)(0)

Sx n x n T bx τ

= − + ⋅=

Namely, we have incorporated a constant random bias to the state space dynamical equation. Then, the state space model that includes the clock skew is given by:

( 1) ( )

( ) ( ) ( )S

T

x n x n T b

y n A x n v n

+ = + ⋅⎧⎨

= +⎩ (9.1.2)

Here, we assume that (0, )b N B∼ where B is the bias covariance matrix and is assumed to be diagonal. Moreover, ( )0x and b are assumed to be statistically independent. In brief, we showed that incorporating an additive bias to the dynamical equation of the state space model is equivalent to add a multiplicative skew parameter to the clock model. Let us define the augmented state space vector by:

( )

( )x n

X nb

⎛ ⎞= ⎜ ⎟⎝ ⎠

(9.1.3)

63

For this augmented state space vector, we will have the following state space model:

( )

( 1) ( )( 1) ( )

0

( ) 0 ( ) ( )

S

T

x n I T I x nX n X n

b I b

y n A X n v n

⎧ +⎛ ⎞ ⎛ ⎞⎛ ⎞+ = = = Φ⎪ ⎜ ⎟ ⎜ ⎟⎜ ⎟

⎝ ⎠ ⎝ ⎠⎝ ⎠⎨⎪ = +⎩

(9.1.4)

It is now possible to apply the standard KF equations in order to obtain a centralized optimal estimate of both skew b and offset ( )x n . Our objective is to estimate the entire augmented vector given in (9.1.3) (that is, the offset and the skew at each node over the network) in an optimal way. In addition, we will attempt to develop a decentralized version that hopefully converges to the optimal centralized solution. Remark

An alternative easier method is to estimate the vector (0)xb

⎛ ⎞⎜ ⎟⎝ ⎠

. This is exactly equivalent

because if one knows the vector (0)xb

⎛ ⎞⎜ ⎟⎝ ⎠

, one can easily compute the general vector ( )x nb

⎛ ⎞⎜ ⎟⎝ ⎠

using the relation:

( ) (0) Sx n x nT b= + ⋅ a) Centralized Kalman Filter Algorithm In order to find the optimal parameters, we define the corresponding objective function (Least-Squares approach) and we set its gradients with respect to τ and b to zero. An additional alternative is to write the Kalman Filter equations for the augmented state space vector. Then, we will obtain a vector equation for estimating the optimal offsets together with the optimal skew parameters at each node over the network. The details are hereby omitted for the simple reason that the procedure is very similar to the basic case, and the mathematical manipulations are of no particular interest. b) Decentralized Implementation We briefly discuss the decentralized implementation of the optimal centralized solution. Proceeding similarly to Section 5, it is possible to develop a decentralized Jacobi-like iteration for this problem. Unfortunately, this algorithm generally diverges (the spectral radius of the iteration matrix is bigger than 1). Actually, we obtained a decentralized algorithm that allows us to estimate both the skew and the offset and we have shown (by a simple numerical example) that this algorithm does not converge. We omit the details here, as they do not contain any insights. As a future research direction, it can be interesting to check if the decentralized iterative algorithm converges to the optimal centralized solution in the case of a reduced step size. Namely, if instead of the Jacobi iterative procedure, we try to employ a relaxed Jacobi algorithm (or equivalently, the gradient method) with small step size (see Section 2.1).

64

As it stands, the problem is better solved by estimating the offsets and the skews simultaneously by the centralized optimal algorithm. The main advantage of this approach relies in that it will lead to the optimal solution; however its central drawback is that each node has to communicate with every other node. So far, a decentralized optimal algorithm for the combined estimation problem was not obtained. Therefore, we may consider the separate skew and offset estimation for which several decentralized methods are proposed in the next section.

65

9.2 Separate Skew and Offset Estimation In this section, we propose to solve the time synchronization problem by estimating separately the clock offset and the clock skew (rate offset) at each network node. The clock offset estimation problem can be solved by one of the preceding algorithms (see sections 5 and 7 for the single measurement and the multiple measurement cases respectively). Three different methods are presented in order to solve the clock skew estimation problem. a) The Logarithmic Method

In the subsequent analysis, the measurements are obtained in a different way than in the offset estimation problem. Namely, each node is sending a pair of probe packets located at significant time intervals (i.e., large compared to the variances of the individual measurements) to each one of its neighbors. Figure 9.1 depicts the situation for the pair of neighboring nodes iΛ and jΛ . Time is stamped on the packets 1k and 2k by the sender iΛ upon transmissions ( 1 2 1( ), ( ) ( )i i iT k T k T k ) and by the receiver jΛ upon receiving the packets ( 1 2( ), ( )j jR k R k ). Indeed, in order to obtain an accurate estimate of the skew parameter, we need to take a pair of probe packets well spaced in time. In other words, iα can be regarded as the slope of the local time as a function of the reference time.

Figure 9.1. Communication between two neighboring nodes for the skew estimation problem.

Let mt denote the transmission time of packet 1, 2mk m = according to the reference time. Then, up to the time-stamping error (assuming that at time 0t = , the offset was iτ ):

( )i m i m iT k tα τ= + Similarly, let mt denote the received time of packet 1, 2mk m = . Then:

( )( )

( ) ( )

m m ij m

j m j m ij m j

t t x k

R k t x kα τ

= +

= + +

iΛ jΛ

2 1( ) ( )i iT k T k

1( )iT k

2( )jR k

1( )jR k

66

Here, ( )ij mx k is the propagation delay of the packet mk over the corresponding link. In the

above equations, both ( )1 2 1 2, , ,t t t t and ( ),i jτ τ are unknown, in addition to the skew

parameters of interest ( ),i jα α . We now manipulate our measurements so that these extra unknowns are cancelled. Let jRΔ denote the time difference between the reception of probe packet 2k by node jΛ and the receiving time of packet 1k at node jΛ according to jΛ ’s clock, namely:

2 1( ) ( )j j jR R k R kΔ = − Let iTΔ denote the time difference between the transmission of probe packet 2k by node iΛ and the transmission time of packet 1k at node iΛ according to iΛ ’s clock, namely:

2 1( ) ( )i i iT T k T kΔ = −

If we divide iTΔ by jRΔ , we obtain an estimate of the relative skew iji

j

αα

α= (assuming

that the transmission delay and the offsets remain constant for the different probe packets ):

22 1

2 1

( ) ( )( ) ( )

i ii i i

j j j

tT T k T kR R k R k

α τ+Δ −= ≅

Δ −( ) 1i itα τ− +( )

2j jtα τ+( ) 1j jtα τ− +( )i

j

αα

= (9.2.1)

It is natural to divide the above quantities in order to cancel the influence of both the offsets

and the time interval. If we apply the logarithm function to the quantity i

j

TRΔΔ

, we will

obtain a relative measurement of the skew logarithm:

( ) ( )log log logiji i j

j

TzR

α αΔ≅ −

Δ (9.2.2)

The term jiz in (9.2.2) corresponds to the measurement of the node pair iΛ and jΛ . Then, one can employ the previous methodology of Section 5 in order to estimate the skew logarithm at each node. In other words, if we substitute jiz for ˆ

jiO and ( )logi iβ α for

iτ , we obtain the same mathematical problem as the previous offset estimation problem. For example, let us develop the optimal decentralized algorithm for the basic statistical LS case, namely: 1R R I−= = and 1

0 0P − = . The state vector and the objective function to be minimized are given by:

( ) ( ) ( )( )1 1 2 2( ) log 0, log ,... logT

N Nx n β α β α β α= = = = (9.2.3)

( )2

,( ) ( )

i

T T Tji i j

i jj N

J y A x y A x z β β∈

= − − = − +∑

67

In order to minimize J , we may compute the partial derivatives with respect to iβ and set them to zero (the procedure is very similar to the one performed in Section 5):

( ) ( )

( )

2

2 1 0

i

i i

Ti ji i ji

j Ni

i ji jj N j N

J AA x A y z

z

β ββ

β β

∈

∈ ∈

∂= − = − − + =

∂

⎡ ⎤= − − ⋅ + + =⎢ ⎥

⎣ ⎦

∑

∑ ∑

( )

i

i

Ti i ji

j N

i jij N

AA x N

A y z

β β∈

∈

⎧ = −⎪⎨

=⎪⎩

∑

∑

Substituting into the partial derivatives leads to:

( ) 0i

i i ji jj Ni

J N zβ ββ ∈

∂= − + =

∂ ∑

From this, we can as usual employ an iterative (synchronous) algorithm in order to implement the above optimal equation:

( )( 1) ( )1

i

k k

i ji jj Ni

zN

β β+

∈

= +∑ (9.2.4)

Initialization: ( )

(0)log (0) 2,3,...i i i Nβ α= = . Here, 0k ≥ is the iteration number.

The procedure in (9.2.4) represents the decentralized synchronous algorithm that implements the optimal equation in order to estimate the rate offset logarithm at each network node given a single pair of measurements. One can easily generalize for the general Kalman Filter framework in a similar manner. The procedure is exactly the same; we have just to perform the following substitutions:

( )

ˆ log

log

iji ji

j

i i i

TO zR

τ β α

⎧ ⎛ ⎞Δ→⎪ ⎜ ⎟⎪ ⎜ ⎟Δ⎨ ⎝ ⎠

⎪→ =⎪⎩

(9.2.5)

After applying the above procedure, one can optimally estimate the rate offset logarithm at each node over the network with respect to the reference node. Consequently, we have shown that the clock skew estimation problem reduces to the same mathematical setup as the offset estimation problem under the appropriate substitutions in (9.2.5). Next, we propose an alternative method in order to estimate the skew parameters without requiring the application of the logarithm function on the measurement sets.

68

b) The Multiplicative Method Now, we will develop an additional method to estimate the skew parameters without requiring the application of the logarithm function on the sets of measurements. As we previously explained, the estimation of the skew values iα has to be done without the prior knowledge of the offsets iτ . We note that the only way to get rid of the dependence of the real time t is to divide iTΔ by jRΔ :

2 1

2 1

( ) ( )( ) ( )

i i i iji

j j j j

T T k T kR R k R k

αςα

Δ −= ≅

Δ − (9.2.6)

The term jiς in (9.2.6) corresponds to the measurement of the node pair iΛ and jΛ . In this case, we do not apply the logarithm function but instead we define the following objective function which is to be minimized:

( )2

, i

iji

ji j NJ αα ς

α∈

⎛ ⎞= −⎜ ⎟⎜ ⎟

⎝ ⎠∑

We considered here the basic LS case, but one can easily generalize to the WLS case. Let us now differentiate ( )J α according to kα and set the partial derivatives to zero:

2k

Jα∂

= −∂

1 2k

kjk

j jj N

ας

α α∈

⎛ ⎞− +⎜ ⎟⎜ ⎟

⎝ ⎠∑

( )20

1

k

i iki

kj N k

j

α ας

αα

α

∈

⎛ ⎞− =⎜ ⎟

⎝ ⎠

−

∑

( ) 2i

i iji

jj N j

α ας

α α∈

⎛ ⎞− +⎜ ⎟⎜ ⎟

⎝ ⎠∑

( ) ( )

0

1 0

0 0

i

i

i

iji

jj N

i iji

j jj N

i j i j jij N

ας

α

α αςα α

α α α α ς

∈

∈

∈

⎛ ⎞− =⎜ ⎟⎜ ⎟

⎝ ⎠

⎛ ⎞⎛ ⎞− − =⎜ ⎟⎜ ⎟⎜ ⎟⎜ ⎟

⎝ ⎠⎝ ⎠

≠ ≠ − ⋅ =

∑

∑

∑

From this, we can employ an iterative (synchronous) algorithm in order to implement the above optimal equation:

( 1) ( )1ˆ ˆi

k kjii j

i j NNα ς α+

∈= ⋅∑ (9.2.7)

We can note that the synchronous decentralized algorithm in (9.2.7) has exactly the same form as the previous algorithm with the logarithm function in (9.2.4). The difference is that in this case, the update is multiplicative and in the other case the update was additive.

69

The main drawback of the two previous methods is that the measurements for the offset estimation and for the skew estimation are not similar. Namely, the measurement model for the offset estimation problem is composed of a fast bilateral exchange (see Figure 4.1). In the measurement model of the skew estimation problem, node iΛ sends a pair of probe packets located at significant time intervals to node jΛ (see Figure 9.1). Since we are interested in performing both the offset and the skew synchronizations, it is valuable to develop an algorithm that employs the same measurement format in the different procedures. Next, we propose an additional sub-optimal method for skew estimation that requires the same measurements format as the offset estimation problem. c) State-Space based Solution Now, we consider the same measurement model as in the offset estimation problem (see Figure 4.1). For this measurement model, it was previously shown that the measurement equation of the state space model is given by:

( ) ( ) ( )Ty n A x n v n= + The entries of the vector ( )x n are defined by the offsets at each node at time t :

( )( ) ( ) 1i i i ix n T n n nα τ= − = − + Here, 0n ≥ is the discrete time index and ( )y n is the measurement set of every pair of neighboring nodes at time n . We note that n need not refer to the actual time, but rather corresponds to the epoch when the n-th measurement set ( )y n become available. The initial state of the system (0)x has the following first and second order statistics: [ ] [ ]0 0(0) cov (0)E x x x P= = . { }( )v n is the measurement noise modeled as a white noise with

zero mean and covariance ( ) 0R n R= > . We assume that { }( )v n is uncorrelated and therefore the matrix R is diagonal and Positive Semi-Definite (PSD). Its i-j element corresponds to the pair of neighboring nodes iΛ and jΛ :

( ) jiijR r=

{ }( ) , (0)v n x are uncorrelated, that is: (0) ( ) 0TE x v n n⎡ ⎤ = ∀⎣ ⎦ .

Let us denote:

( )( )1 2

1

0, ,...,i i

TN

b

b b b b

α −

= =

Then:

( )( )( ) ( )T

x n b ny n A b n v n

τ

τ

= ⋅ +

= ⋅ + +

70

We consider the multiple measurement case and we propose a decentralized sub-optimal algorithm that estimates the offset and the skew separately. The offsets are estimated after each set of measurements in a recursive way (for more details, see Section 7), and the skew parameters are estimated after bn T= sets of measurements only (according to the pair of farthest measurements). This is a sub-optimal scheme since the estimate of b is not performed in accordance with all the sets of measurements, and we are not taking into account the dependence between τ and b . We would like to emphasize the fact that this algorithm is a heuristic method (non-optimal) in opposition to the majority of the previous algorithms that were optimal in the sense that they achieve the minimal value of an objective function. We will now present in more details this decentralized algorithm that estimates the skew parameter at each node over the network. As we explained in the beginning of the present section, it is logical to estimate the skew parameter according to a pair of measurements well spaced in time (because it is like a slope estimation problem), whereas the offsets can be estimated after each set of measurements. Let us compute the difference between ( )by T

and ( )0y (assuming that the offsets iτ are time-invariant):

( ) ( ) ( )( ) ( )

( ) ( ) ( ) ( )0 0

0 0(9.2.8)

Tb b b

T

b bT

b b

y T A b T v T

y A v

y T y v T vA b

T T

τ

τ

⎧ = ⋅ + +⎪⎨

= +⎪⎩− −

= ⋅ +

One can easily note that we have obtained in (9.2.8) an equation that is similar to the measurement equation of the offset estimation problem (see Section 4.4). The only difference is that the measurements and the noises are divided by the number of measurement sets bT . The consequence is that now, the covariance matrix of the noise will

be equal to ( )2

2

bR

T⋅ , i.e., the measurement noise distribution is given by:

( ) ( )

( )20 20,b

b b

v T vv N R

T T

⎛ ⎞− ⎜ ⎟=⎜ ⎟⎝ ⎠

∼ .

In other words, we showed that the skew estimation problem considered here reduces to the basic offset estimation problem of Section 5 under the following substitutions:

( ) ( )0b

b

x by T y

yT

v v

→⎧⎪ −⎪ →⎨⎪⎪ →⎩

(9.2.9)

71

For example, the synchronous decentralized algorithm that estimates ib in the most basic case (Least-Squares estimation) is given by:

( )( )( 1) (0) ( )1ˆ ˆˆ ˆb

i

Tk ki ji ji b j

j Nb i

b O O T bT N

+

∈

= − + ⋅⋅ ∑ (9.2.10)

In the general case (DKF), we will obtain the following synchronization procedure:

( )( )( 1) (0) ( ) (0)1 1ˆ ˆˆ ˆ (9.2.11)21 1

2

b

i

i

Tk k ii ji ji b j b

j N ji ib

j N ji i

bb O O T b Tr B

Tr B

+

∈

∈

⎡ ⎤= − + ⋅ + ⋅⎢ ⎥

⎛ ⎞ ⎢ ⎥⎣ ⎦⋅ +⎜ ⎟⎜ ⎟⎝ ⎠

∑∑

Here, we assume as usual that (0, )b N B∼ where B is the bias covariance matrix and is assumed to be diagonal. In practice, we will estimate the offsets according to one of the previous developed algorithms in a recursive way (after each set of measurements). After that enough sets of measurements, say bT arrived, one can estimate ib at each node over the network

according to one of the previous algorithms. While we estimate ib at each node over the network, we can easily compute the skew parameter using the relation:

ˆˆ 1i ibα = + . Then, we will assume that the skew parameter remains constant during a known constant time bτ , so we have just to normalize the measurements by the estimated skew:

,ˆ ˆi i

i i

T Rα α

.

In order to determine the correct value of the constant time bτ , we have to know how much time the skew can be assumed to stay constant in our model. This can be done according to the literature on how to model a clock and is beyond the scope of this research. After bτ

time units, one can estimate once more the new parameters ib at each node over the network according to the first and the last measurements of this new time period. In other words, after each set of measurements we will estimate ( )ˆ 2,...,i i Nτ = , whereas

( )ˆ 2,...,ib i N= is estimated according to ( )by T and ( )0y at the first cycle, ( )2 by T and

( )1by T + at the second cycle, etc. Remark: The number bT can be different for each node namely; each node can update the skew of its own clock at his most appropriate time. This approach is similar to the scheme in [21], where the authors treat skew synchronization and offset synchronization on different time scales. That is, the parameters

iα are adjusted every skewτ time units, whereas the parameters iτ are adjusted every

offsetτ time units, with:

skew offsetτ τ .

72

So far, we have obtained several decentralized algorithms for estimating the skew parameters b with no dependence on the offsets τ . In summary, we have developed several decentralized algorithms for clock synchronization that can deal with general clocks, including both offsets and skews. The first option is to estimate the clock offset and skew simultaneously whereas the second alternative estimates them separately. Concerning the combined estimation problem, only a brief description was considered and no concrete results were presented. For the separate estimation method, three algorithms were proposed in order to estimate the skew parameter. The sub-optimal solution is the only one that uses the same measurement format as that of the offset estimation procedure. Nevertheless, the parameter bT must be known, and the result is only sub-optimal considering that only two sets of measurements are used. In the two other optimal methods, there is no such requirement on bT (because 2 1t t− is cancelled), yet the algorithms are based on measurements that are not similar to the offset estimation problem. In the next section, we present simulation results over several network topologies for evaluating and comparing the accuracy of the proposed time synchronization schemes.

73

10. Numerical Results In this section, we implement some of the algorithms that we previously developed for typical problems and we compare the results with the existing algorithms. First of all, let us describe in a concise way the different algorithms that we chose to implement. 10.1 Algorithms Description CTP: (“Classless Time Protocol” [14]). This algorithm computes each offset by calculating the average of the relative offsets of all the adjacent nodes and is equivalent to performing the Least-Squares statistical method. For example, if one node has two neighbors with relative offsets of (+1) and (–2) respectively, the node adjusts its own clock by:

1 2 0.52

+ −= − . In [14], the authors used a measurement filter in order to obtain less noisy

measurements (low queuing delay). The procedure is repeated at each node and for each set of measurements until convergence. In simulations, both the centralized and the decentralized versions are considered. WLS: Similar to CTP, but now the offsets are calculated using the Weighted Least-Squares method. Namely, each offset is estimated as the average over the neighbors' relative measurements, but each measurement is multiplied by a weight according to its accuracy. For example, in the simulations we made the logical assumption that the links with a smaller queuing delay (i.e., there is a light load) have a smaller variance in their measurements. As a consequence, links with small queuing delays are associated to a bigger weight when evaluating the offsets. DKF: An additional decentralized algorithm based on the Kalman Filter framework. This algorithm is related to the following state space equations:

( 1) ( )( ) ( ) ( )T

x n x ny n A x n v n

+ =⎧⎨

= +⎩

Here, ( )1 2( ) 0, ,... T

Nx n τ τ τ= ( iτ is the offset of the node iΛ ), ( )y n is the measurement at time n and A is the reduced incidence matrix (for more details, see the problem formulation and the scientific background sections). Moreover, we assume the following usual assumptions: - (0)x is the initial Gaussian state of the system with the following first and second order statistics:

[ ] [ ] ( )( ) 0(0) (0) cov (0) (0) (0) (0) (0) (0)Tx x x xE x m x E x m x m P P⎡ ⎤= = − − = =⎣ ⎦

- { }( )v n is the measurement white Gaussian noise with zero mean and covariance

( ) 0R n R= > . - { }( ), (0)v n x are uncorrelated for any n .

74

As we previously show, the decentralized Kalman Filter algorithm that we developed converges to the optimal centralized solution under the appropriate conditions. We chose to implement the DKF algorithm in a synchronous way, with only one set of measurements, but we will examine several different values of the covariance matrices 0P and R . CKF and CLS: The centralized Kalman Filter and the centralized Least-Squares algorithms in their standard form. For more details, see Section 5.2. SOA: The sub-optimal algorithm that neglects the off-diagonal terms of the inverse covariance matrix. For more details, see Section 7.4. As we previously mentioned, NTP is the most widely accepted standard for synchronizing clocks over the internet [28-30]. The three following algorithms are different hierarchical versions of the Network Time Protocol (NTP) and are used in this section as a benchmark. These three NTP-based hierarchical schemes were inspired by [14]. NTP-1: In this first scheme, each node arbitrarily selects a single neighbor which is one hop closer to the reference node than itself. Node iΛ adjusted his clock by:

2ij ji

i

T Tτ

Δ −Δ= .

We start with nodes that are one hop away from the reference node, move to nodes that are two hops away from it, etc. NTP-2: This second scheme is similar to NTP-1, but in this case ijTΔ and jiTΔ are selected separately on each directed link. In other words, it tends to find the smallest possible queuing delays for each link. For example, if the queuing delays from node iΛ to node kΛ are (2, 3, 6, 5, 3, 4, 5, 6) and back from kΛ to iΛ (6, 5, 4, 6, 4, 5, 3, 7), it will select the minimal value on each direction separately, i.e., (2, 3) and calculate the offset based on this modified measurement. NTP-3: This third scheme is a multi-parent scheme. Each node computes its clock offsets using the average of all its neighbors which are one hop closer to the reference node than itself. This can be interpreted as the CTP algorithm where only the parent nodes are used for calculations. 10.2 Network Topologies In order to evaluate the results of the different algorithms, we need first to construct the network topology setup. The network topology we choose to implement is based on the random model of [47]. We start with a single root node (also called the “Reference Time Node") and restrict the hop distance of each node to the root to be at most a certain number of hops. The connectivity between the nodes in the network is randomly selected. The propagation delay of each edge is chosen once for both directions (assumed to be symmetric) of any existing edge based on the uniform distribution [ ]0,10U . The queuing delay of each edge is chosen as Erlang (or Gamma) distribution where the number of

75

exponentials (α ) and the mean time between events (θ ) are randomly selected [ ]1,10U

and [ ]0.1,1U respectively. The parameters are sampled once for each edge. The clock offsets with respect to the “Reference Time Node” are randomly selected based on a uniform distribution [ ]10,10U − . The offset of the reference node ( 1Λ ) is set to zero since it is assumed to be synchronized with the UTC. As suggested by NTP in [30], eight round-trip packets are transmitted over each edge and

ijTΔ are measured based on these packets. Then, we have to pick up the best measurement among the eight (the one with the smallest transmission delay). To be compliant with the NTP message format, we suggest using four time stamps in each bilateral transmission. We constrained each node to have at least one neighbor. In this numerical example, we naturally assumed that all the clocks run at the same speed, i.e., 1, ,i j i jα α= = ∀ and that the offsets are time invariant. In all the subsequent simulations, we consider a general network as depicted in Figure 10.1, where internal loops are allowed. This is important because the Least-Squares approach improves the estimates by imposing the global constraints for all the loops in the multihop network.

Figure 10.1. A general network with internal loops.

Three different networks are considered:

• Network 1: a 400 node network with relatively high connectivity of 1798 edges.

• Network 2: a 400 node network with 997 edges.

• Network 3: a 170 node network with 1200 edges.

76

10.3 Graphical Results

In this section, we compare CTP to the WLS and to the DKF algorithms in several interesting cases. Finally, we perform the recursive centralized Kalman Filter (with 50 measurement sets) and compare its performance to the centralized Least-Squares method and to the Sub-Optimal Algorithm (SOA) that neglects the off-diagonal terms of the inverse covariance matrix.

In Appendix C, one can find a performance comparison between the CTP algorithm and three hierarchical versions of NTP, and a convergence analysis of the decentralized version of CTP (see Figures C.1 to C.4). All those simulation results are similar to the work performed by O. Gurewitz et al. in [14] and were repeated in order to constitute the starting point of the subsequent results. The next step consists of implementing the Weighted Least-Squares algorithm in a decentralized manner and to compare the results to the decentralized CTP algorithm. In the WLS algorithm, we decide to take several different values of the weighting matrix R . Then, we will model the queuing delay according to this weighting matrix. For example, we can choose the queuing delay according to the Gamma distribution where the number of exponentials (α ) is equal to the upper integer value of R and the mean time between events (θ ) is equal to R . The other option is to take a normal distribution with zero mean and covariance matrix equal to the matrix R .

Figure 10.2 presents the comparison between the decentralized CTP and WLS algorithms for the case where the weighting matrix R is randomly distributed [ ]0.1,12U . The network topology is the same as before (400 nodes with 997 edges) and the queuing delay is randomly chosen according to the Gamma distribution relative to R .

77

0 2 4 6 8 10 12 140

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

Absolute Clock offset

Frac

tion

of n

odes

WLS Vs. Distributed CTP 400 Nodes

Distributed CTPWLS CTP

Figure 10.2. Comparison between the decentralized CTP and WLS algorithms in Network 2.

As expected, the WLS method outperforms the CTP (or LS) algorithm. The reason is that in the WLS version, each measurement is multiplied by a weight according to its accuracy that depends on the queuing delay. Since the weights are equal to the variance of the queuing delay; it outperforms the case where the weights are identically equal to one. This scenario is not very realistic and is more theoretic, because in practice we do not know the exact covariance matrix of the delay.

Unsurprisingly, if the weighting matrix R is taken as the identity matrix, we return to the Least-Squares case, equivalent as it is to the CTP algorithm. As we can see in the next figure, the graphs perfectly coincide.

78

0 2 4 6 8 10 12 140

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1


Frac

tion

of N

odes



Figure 10.3. Comparison between the decentralized CTP and WLS algorithms in Network 2, with R I= .

The next scenario corresponds to the case where half the nodes have a certain value ijr and the remaining have twice that value, i.e., half the nodes are smarter than the others. For example, we chose the values 4.5 and 9. In this case too, the WLS method is more accurate than the CTP algorithm as we can see in the next figure.

We note that in all the relevant figures, the curve describing the CTP algorithm is not exactly the same. Indeed, it slightly depends on the noise realization and each figure was plotted for a random realization. However, this fact does not compromise the comparison between the different algorithms. No matter what the noise realization is, the insight presented by the results remains correct.

79

0 2 4 6 8 10 12 140

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

Absolute value clock offset

Frac

tion

of n

odes



Figure 10.4. Comparison between the decentralized CTP and WLS algorithms in Network 2, with different ijr .

As previously pointed out, we cannot know exactly the covariance matrix R . Hence, we check several cases, where instead of R , we may employ as an example 2R or the matrix R plus an additive Gaussian Noise (with different variances). In other words, the matrix R is not known exactly, but we can use an approximation.

Figure 10.5 shows that the WLS method with the exact R gives the best results and the CTP algorithm (with R I= ) achieves the worst results. Interestingly, the WLS algorithm with 2R gives intermediate results (between LS and WLS). We can infer that even if the exact R is not employed, we can still outperform the results of the basic CTP protocol by using a good approximation of the matrix R . We will see in Figure 10.10 that if the approximation is not quite accurate, the regular LS outperforms the WLS algorithm.

80

0 2 4 6 8 10 12 140

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1


Frac

tion

of n

odes

WLS Vs. WLS2 Vs. Distributed CTP 400 Nodes


WLS2 CTP

Figure 10.5. Comparison between the decentralized CTP and WLS algorithms (with R and 2R ) in Network 2.

Next, let us analyze the robustness of the matrix R in the WLS algorithm. Since the exact R is not known, we implement the WLS algorithm with a weighting matrix equal to R plus an additive Gaussian noise with zero mean and two different standard deviations. We chose the following model:

1ij

ij

nr+ .

Here, ijn is a Gaussian noise with zero mean and standard deviation equal to 0.05 in the first case and to 0.2 in the second. The results are summarized in the next figure.

81

0 2 4 6 8 10 12 140

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1


Frac

tion

of n

odes



Figure 10.6a. Small noise variance

0 2 4 6 8 10 120

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1


Frac

tion

of n

odes



Figure 10.6b. Higher noise variance

Figure 10.6. Comparison between the decentralized CTP and WLS algorithms (with additive Gaussian noises in R ) in Network 2.

As we can see from the previous figure, the results depend on the noise intensity. In the first case (Figure 10.6a), the variance of the Gaussian noise is relatively small and the WLS method outperforms the CTP algorithm. In the second case (Figure 10.6b), the variance noise is increased and as a consequence, the WLS is not anymore better than CTP. Therefore, we partially investigate the robustness properties of the exact covariance matrix R , and obtain as expected that if the intensity of the additive Gaussian noise is too high, the WLS method is not appropriate, whereas when the noise variance is relatively low, the WLS algorithm gives better results than CTP.

82

The next part of our numerical analysis is dedicated to the DKF algorithm. In the latter, we can incorporate some a-priori knowledge of the initial offsets and we compare it to the decentralized CTP algorithm. In Section 6, we showed that the DKF algorithm converges to the optimal solution obtained by performing the centralized Kalman Filter. In addition, the DKF algorithm has the same mathematical structure as the decentralized CTP with the incorporation of a supplementary term related to the a-priori knowledge. First, we check the DKF algorithm in the case where the initial covariance matrix 0P is an infinite diagonal matrix and R is equal to the identity matrix. As expected, this case reduces to the basic LS case (CTP algorithm) and the graphs are plotted in Figure 10.7.

0 2 4 6 8 10 12 140

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1


Frac

tion

of n

odes

DKF Vs. Distributed CTP 400 Nodes

Distributed CTPDKF

Figure 10.7. Comparison between the decentralized CTP and DKF algorithms (with ( )1

0P diag− → ∞ and R I= ) in Network 2.

83

The next scenario considered is related to the case where half the network nodes are smart (i.e., small initial variances in the main diagonal of 0P ), and the remaining are unintelligent (i.e., bigger initial variances in the main diagonal of 0P ). In all the cases, the initial covariance matrix 0P is assumed to be diagonal and the initial offsets vector 0x is supposed to be equal to zero. According to the Kalman Filter requirements, the initial offsets have a Gaussian distribution:

( ) ( )0 00 ,x N x P∼ .

We present the results in Figure 10.8 in the case where R I= and 0P is given by:

( )( )( )0

0.01,0.19 ;

5,10 ;ii

U half the nodesP

U the remainder

⎧⎪= ⎨⎪⎩

∼∼

.

The case where the matrix R takes different values is straightforward and the results are not presented here because we want to focus on the influence of the a-priori knowledge.

0 2 4 6 8 10 12 140

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1


Frac

tion

of n

odes


Distributed CTPDKF

Figure 10.8. Comparison between the decentralized CTP and DKF algorithms (with different ( )0 ii

P and R I= ) in Network 2.

84

The last case we analyze in the DKF context is the one where 10% of the nodes are perfectly synchronized to the UTC (through a GPS for example), and the remainder is not synchronized at all. Namely, for these arbitrary 40 nodes we take the initial variances very small (0.01) and the offsets equal to zero, and for the rest of the nodes, the variances tend to infinity and the offsets are randomly chosen according to a uniform distribution. The graphical comparison between the decentralized CTP algorithm and DKF is presented in Figure 10.9. As expected in this case too, the DKF algorithm outperforms the decentralized CTP method in terms of clock accuracy.

0 2 4 6 8 10 12 140

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1


Frac

tion

of n

odes


Distributed CTPDKF

Figure 10.9. Comparison between the decentralized CTP and DKF algorithms (with 10% of nodes synchronized via GPS) in Network 2.

In all the previous simulations, the results we obtained agree with our expectations. Namely, the CTP algorithm is more accurate than all the NTP schemes and the corresponding decentralized version converges to the optimal centralized solution (see the proof in [14]). Moreover, the WLS algorithm is more accurate than the basic CTP (LS) method and the DKF outperforms the CTP algorithm in some suitable scenarios. These

85

results are expected from the theory because in the WLS algorithm, we provide the best weighting factors to the measurements and in the DKF method, some a-priori knowledge is included in the right manner. Hence, in several appropriate scenarios (if some a-priori knowledge is available or the variance of the measurements can be approximated), the DKF and the WLS algorithms are preferable. As expected from the theory, this typical application shows that all the algorithms give satisfactory results. In many cases, the Kalman Filter is the best algorithm, followed by the Weighted Least-Squares and the regular Least-Squares methods and the NTP based protocols.

In this section, we are not comparing the convergence rate of the different algorithms and this can be also an important factor in the choice of the appropriate protocol.

The last part of this section is devoted to the comparison of the recursive Centralized Kalman Filter (CKF) algorithm to the Sub-Optimal Algorithm (SOA) that neglects the off-diagonal terms of the inverse covariance matrix (for more details, see Section 7.4). The latter is only a sub-optimal procedure and there is no need to know the parameter n . We check several values of n (the number of measurement sets) and 0P (the initial covariance matrix). In this part, we consider the topology of Network 3 (170 nodes with 1200 edges). Our objective is to compare the accuracy of the estimated offsets obtained by both the optimal and sub-optimal algorithms and to compare the variances.

In this case, the queuing delay is randomized in accordance with the Kalman Filter assumptions, namely normally distributed with zero mean and covariance matrix R . The first situation we considered is the basic case where 0P R I= = . The following figure presents the results obtained by applying both the optimal CKF method and the SOA for different values of n . As expected, in all the cases the optimal algorithm gives the best results. The sub-optimal algorithm gives relatively poor results but reduces the complexity and is not diverging. If the important criterion is the accuracy of the clock synchronization, it is obvious that the optimal CKF is preferable, but if the accuracy is less important than the complexity and the computation time, the sub-optimal algorithm can be valuable. We are also interested in comparing the variances of the different algorithms. We compute the following expression:

( ) ( )

( )CKF SOAii ii

CKF ii

P PP−

(10.3.1)

at each node over the network and present the results in a graphical form in Figure 10.11. As we can note from the figure, the variances of the CKF method are always bigger than the variances of the SOA algorithm and the normalized deviation is quite elevated (from 45% to 75 %). In other words, the sub-optimal algorithm underestimates the variances and consequently is not a good method. Since the variances are small, it will give a wrong estimation and will not correct the results. In brief, we can infer that the simple sub-optimal algorithm does not solve the problem efficiently and it is preferable to employ the Kalman Filter algorithm.

86

1n =

0 1 2 3 4 5 6 7 8 9 100

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1


Frac

tion of nod

es

CKF Vs. SOA 170 Nodes

CKFSOA

10n =

0 1 2 3 4 5 6 7 8 9 100

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1


Frac

tion of nod

es


CKFSOA

30n =

0 1 2 3 4 5 6 7 8 9 100

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1


Frac

tion of nod

es


CKFSOA

50n =

0 1 2 3 4 5 6 7 8 9 100

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1


Frac

tion of nod

es


CKFSOA

Figure 10.10. Comparison between CKF and SOA (with 0P R I= = ) in Network 3 for different values of n .

87

0 20 40 60 80 100 120 140 160 1800.4

0.45

0.5

0.55

0.6

0.65

0.7

0.75

0.8

[ (P

CK

F) ii - (P

SO

A) ii ]

/ (P

CK

F) ii

Node number

Variance Comparison in a 170 nodes network

Figure 10.11. Variance comparison between CKF and SOA (with 0P R I= = ) in Network 3.

The next step extends the preceding analysis to a more general framework where the measurement covariance matrix R is uniformly distributed and the queuing delays are normally distributed with zero mean and covariance matrix R . In other words:

[ ]( )

0.01,12

0,delay

R U

Q N R

∼

∼

In addition, we consider that 10% of the nodes are perfectly synchronized to the UTC (through a GPS for example), and the remainder is not synchronized at all. Namely, for these arbitrary 17 nodes we will take the initial variances very small (0.01) and the offsets equal to zero, and for the rest of the nodes, the variances tend to infinity and the offsets are randomly chosen according to a uniform distribution. In this analysis, we also compare the results to the Centralized Least-Squares (CLS) algorithm. In fact, as we previously mentioned, the SOA gives relatively poor results in comparison to the CKF algorithm. We thus want to determine if the basic LS algorithm is more accurate than SOA. In other words, is it preferable to totally neglect the a-priori knowledge (i.e., take 1

0 0P − = ) or to consider this a-priori knowledge and then to neglect a part of its influence?

88

Figure 10.12 presents the results for the offsets obtained by applying the optimal CKF method, the SOA and the CLS algorithms for the same different values of n as in the previous case. Figure 10.13 compares the variances between CKF and SOA. We can draw the same conclusions as in the previous case, with the exception of a normalized variance dispersion between 4% to 78%. Moreover, we obtained that the sub-optimal algorithm is even worse (in terms of clock accuracy) than the basic centralized Least-Squares method (that does not take into account the initial covariance matrix). From the two subsequent figures, we may conclude that despite the fact that the sub-optimal algorithm is valuable for complexity and computation time reasons, the results are relatively far from the optimal ones. Hence, we do not consider it as an efficient algorithm to solve the time network synchronization problem considered in this thesis. In order to solve this problem in an efficient way, we propose several alternatives. The first option is to apply the recursive version of the DKF algorithm that we developed in this report because it leads to the optimal solution and requires only local communication. A second alternative is to investigate another sub-optimal solution, such as the method proposed in [17]. This recent work presents a distributed Kalman Filter that estimates sparsely connected, large scale systems (L-banded matrix algorithm). Actually, a smart approximation of the covariance matrix is employed and the authors prove that the solution converges to the global Kalman Filter as the number of bands increases.

89

1n =

0 1 2 3 4 5 6 7 8 9 100

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1


Frac

tion of nod

es

CKF and CLS Vs. SOA 170 Nodes

CKFSOACLS

10n =

0 1 2 3 4 5 6 7 8 9 100

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1


Frac

tion of nod

es


CKFSOACLS

30n =

0 1 2 3 4 5 6 7 8 9 100

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1


Frac

tion of nod

es


CKFSOACLS

50n =

0 1 2 3 4 5 6 7 8 9 100

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1


Frac

tion of nod

es


CKFSOACLS

Figure 10.12. Comparison between CKF, SOA and CLS (with [ ]0.01,12R U∼ and 0P I≠ ) in Network 3 for different values of n .

90

0 20 40 60 80 100 120 140 160 1800

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

[ (P

CK

F) ii - (P

SO

A) ii ]

/ (P

CK

F) ii

Node number

Variance Comparison in a 170 nodes network

Figure 10.13. Variance comparison between CKF and SOA (with [ ]0.01,12R U∼ and

0P I≠ ) in Network 3.

In summary, we performed several clock synchronization algorithms over different network topologies and compare the results. We obtained that the algorithms based on the Kalman Filter framework give the best results in terms of clock accuracy. The basic Least-Squares algorithm (equivalent to CTP in [14]) outperforms the three hierarchical NTP schemes considered and we have extended the results to some more general situations. We can provide different weights to the measurements according to their accuracy and incorporate a-priori knowledge of the problem. As seen in the simulation results, the decentralized Kalman Filter algorithm constitutes the most appropriate and the most general method for clock synchronization among the proposed algorithms. The clock accuracy is the most precise, it requires only local communication between neighbors and the complexity is not increased. In the last part of this section, we compared the centralized Kalman Filter solution to a simple sub-optimal algorithm that neglects the off-diagonal terms of the inverse covariance matrix 1

0P − . As expected, the optimal algorithm gives improved results with respect to the sub-optimal method, and this is why we needed to develop a decentralized version of the Kalman Filter. Simulation results on the skew estimation problem are not considered in this thesis. As a future work, one may envisage the comparison of the different algorithms of Section 9. In the next section, we present the conclusions and several directions for future research.

91

11. Conclusion and Future Work We developed several decentralized algorithms for estimating the offset at each node in a network with respect to the reference time, using a Kalman Filter framework. These algorithms can be either synchronous or asynchronous, some of which being recursive. In addition, we showed that these decentralized filtering algorithms converge to the optimal centralized solution. The essential characteristic of these algorithms is their decentralized nature; each node can estimate its clock offset only by exchanging packets with its one-hop neighbors. We considered the case where all the clocks run at the same speed, i.e.,

1i jα α= = (there is no skew) as well as the case where i jα α≠ . In the latter case, we developed several different estimation algorithms: one for estimating the clock skews (without knowledge of the offsets) and one for estimating the clock offsets. In practice, we will treat skew synchronization and offset synchronization on different time scales. We obtained that the offset and the skew estimation problems reduce to the same mathematical setup. Hence, we developed several decentralized algorithms for clock synchronization that can treat general clocks with both offsets and skews. We investigate two different approaches. In the first, time offsets and rate offsets are estimated simultaneously, whereas the second is based on a separate estimation of both parameters. In summary, we extended the existing Least-Squares results using the Kalman Filter framework. Namely, we can give different weights to the measurements according to their accuracy and include a-priori knowledge. The main algorithm is both decentralized (requires only local broadcasts), recursive (works in real-time applications) and converges to the optimal centralized solution. As expected, under some appropriate assumptions, we showed that our optimal estimated offsets correspond to the Maximum A-Posteriori estimator (or to the Maximum-Likelihood estimator in the statistical Least-Squares case). We also considered several extensions to the basic case, like the incorporation of a discount factor and the exposure to temporary communication failures. We tested the different algorithms on typical networks and compared the results with several versions of the Network Time Protocol. In most of the cases, the proposed algorithms outperform the NTP schemes and the LS method. In addition, we compared the recursive CKF algorithm to a simple sub-optimal algorithm that neglects the off-diagonal terms of the inverse covariance matrix. Several directions may continue the work performed in this research thesis. Among these, we suggest the following:

• The general clock problem that includes both offset and skew was not solved in an optimal way. It can be interesting to find a decentralized optimal algorithm that converges to the optimal solution and makes use of all the measurements.

• There exist several approaches to performing heuristics that approximate the

Kalman Filter solution. For example, [17] presents a distributed Kalman Filter to estimate sparsely connected, large scale systems (L-banded matrix algorithm). It can be worthwhile to use the same approach to solve the clock synchronization problem.

92

• In this thesis, we gave more importance to theoretical analysis than to numerical simulations. Extending the simulation results to the case where a general clock is considered (including skew estimation), and implementing the algorithm with temporary communication failures could provide fruitful. Moreover, it can be valuable to compare the convergence rate of the different algorithms in a numerical analysis.

• In the general state space problem that includes process noise, we did not achieve

an optimal decentralized algorithm. A further possible direction for future research may be to solve this problem optimally.

• So far, the offsets were considered to be time-invariant and the network topology

static. The next step involves solving the same problem when these assumptions are relaxed, that is, dynamic topologies with time-varying offsets. In the time-varying case, the advantageous properties of the Kalman Filter structure can be exploited.

• As we previously noted, the problem that we considered in this thesis can be viewed

as a general problem related to distributed estimation based on relative measurements in sensor networks. The time synchronization and the sensor localization problems are only special cases. An additional question of interest is to consider a general framework in order to estimate the distances between several cooperative agents. An interesting example is that of a group of aircraft flying in formation, where we seek to estimate the distance to the leader. This problem seems to be very interesting albeit more difficult due to its non-linear nature.

93

References [1] C. Aitken (1935), "On Least-Squares and linear combinations of observations",

Proceedings of the Royal Society of Edinburgh, 55, 42-48. [2] P. Alriksson and A. Rantzer, "Distributed Kalman filtering using weighted

averaging", Proceedings of the 17th International Symposium on Mathematical Theory of Networks and Systems, Kyoto, Japan, 2006.

[3] P. Barooah, and J. P. Hespanha, “Distributed estimation from relative

measurements in sensor networks,” Proceedings of the 3rd Int. Conf. on Intelligent Sensing and Information Processing, 2005.

[4] T. M. Berg and H. F. Durrant-Whyte, "Distributed and decentralized estimation",

Proceedings of the Singapore International Conference on Intelligent Control and Instrumentation, vol. 2, 1992.

[5] D. P. Bertsekas and J. N. Tsitsiklis, Parallel and distributed computations, Prentice

Hall, Englewood Cliffs, NJ, 1989. [6] R. Carli, A. Chiuso, L. Schenato and S. Zampieri, "Distributed Kalman filtering

based on consensus strategies", IEEE Journal on Selected Areas in Communications, vol. 26, no. 4, May 2008.

[7] W. H. Chung and J. L. Speyer, "A general framework for decentralized estimation",

Proceedings of the American Control Conference, Seattle, Washington, June 1995. [8] Mark Coates, "Distributed particle filters for sensor networks", Proceedings of the

3rd International Symposium on Information Processing in Sensor Networks, Berkeley, California, USA, 2004.

[9] J. Elson, L. Girod and D. Estrin, "Fine grained network time synchronization using

reference broadcast", Proceedings of the 5th Symposium on Operating Systems Design and Implementation, pp. 147-163, Boston, Dec. 2002.

[10] J. Elson and K. Romer, "Wireless sensor networks: A new regime for time

synchronization", Tech. Rep., UCLA, July 2002. [11] P. Ferguson and J. How, "Decentralized estimation algorithms for formation flying

spacecraft", AIAA Guidance, Navigation and Control Conference, Austin, Texas, Aug. 2003.

[12] A. Giridhar and P. R. Kumar, “Distributed clock synchronization over wireless networks: algorithms and analysis”, Proceeding of the 45th IEEE Conference on Decision and Control, San Diego, Dec. 2006. [13] G. C. Goodwin and K. S. Sin, "Adaptive filtering prediction and control", Prentice

Hall, Englewood Cliffs, NJ, 1984 (chapter 3).

94

[14] O. Gurewitz, I. Cidon and M. Sidi, “Network time synchronization using clock offset optimization”, IEEE International Conference on Network Protocols (ICNP), pp. 212-221, Atlanta, GA, November 2003.

[15] H. R. Hashemipour, S. Roy and A. J. Laub, "Decentralized structures for parallel

Kalman filtering", IEEE Trans. on Automatic Control, vol. 33, no. 1, pp. 88-94, Jan. 1988.

[16] R. A. Horn and C. R. Johnson, Matrix analysis, Cambridge, UK: Cambridge

University Press, 1991 (pp.344-347). [17] U. A. Khan and J. M. F. Moura, "Distributing the Kalman Filter for large-scale

systems", IEEE Trans. on Signal Processing, 2008. [18] R. E. Kalman, "A New approach to linear filtering and prediction problems",

Transactions of the ASME, vol. 82 (Series D), pp. 35-45, 1960. [19] E. D. Kaplan, editor, "Understanding GPS: principles and applications", Artech

House, 1996. [20] E. D. Kaplan, editor, "Adaptive Kalman Filter for time synchronization over

packet-switched networks", 2nd International Conference on Communication Systems Software and Middleware (COMSWARE), Jan. 2007.

[21] R. Karp, J. Elson, D. Estrin and S. Shenker, “Optimal and global time synchronization in sensornets”, Center for Embedded Networked Sensing, University of California, Los Angeles, Tech. Rep., April 2003. [22] C. T. Kelley, Iterative methods for linear and nonlinear equations, SIAM, Philadelphia, PA, 1995. [23] L. Lamport, "Time, clocks and the ordering of events in a distributed system",

Communications of the ACM, 21(4):558-565, July 1978. [24] A. Mainwaring, D. Culler, J. Polastre, R. Szewczyk and J. Anderson, "Wireless

sensor networks for habitat monitoring", Proceedings of the 1st ACM International Workshop on Wireless Sensor Networks and Applications, ACM press, pp. 88-97, 2002.

[25] J. Mannermaa, K. Kalliomaki, T. Mansten and S. Turunen, "Timing performance of

various GPS receivers", Proceedings of the 1999 Joint Meeting of the European Frequency and Time Forum and the IEEE International Frequency Control Symposium, pp. 287-290, April 1999.

[26] M. Maroti, G. Simon, B. Kusy and A. Ledeczi, "The flooding time synchronization

protocol", Proceedings of the 2ndInternational Conference on Embedded Networked Sensor Systems, pp. 39-49, Baltimore, MD, USA, Nov. 2004.

[27] P. S. Maybeck, Stochastic models, estimation, and control, New York:

Academic Press, Inc, 1979. Republished, Arlington, VA: Navtech, 1994.

95

[28] D. L. Mills, “Internet time synchronization: the network time protocol”, IEEE Trans. Commun., vol. COM-39, no. 10, pp. 1482-1493, Oct. 1991. [29] D. L. Mills, “Improved algorithms for synchronizing computer network clocks”, IEEE/ACM Trans. Netw., vol. 3, no. 3, pp. 245-254, June 1995. [30] D. L. Mills, Network Time Protocol (version 3) specification, implementation and

analysis, Network Working Group Report. Univ. Delaware, RFC-1305, pp. 113, 1992.

[31] M. Mock, R. Frings, E. Nett and S. Trikaliotis, "Continuous clock synchronization

in wireless real-time applications", The 19th IEEE Symposium on Reliable Distributed Systems (SRDS), pp. 125-133, Washington – Brussels – Tokyo, Oct. 2000.

[32] E. M. Nebot, M. Bozorg and H. Durrant-Whyte, "Decentralized architecture for

asynchronous sensors", Autonomous Robots 6, pp. 147-164, April 1999. [33] K. Plarre and P. R. Kumar, "Object tracking by scattered directional sensors",

Proceedings of the 44th IEEE Conference on Decision and Control, pp. 3123-3128, Seville, Spain, Dec. 2005.

[34] R. Olfati-Saber, "Distributed Kalman Filter with embedded consensus filters",

Proceedings of the 44th IEEE Conference on Decision and Control, and European Control Conference, Dec. 2005.

[35] R. Olfati-Saber, J. A. Fax and R. M. Murray, "Consensus and cooperation in

networked multi-agent systems", Proceedings of the IEEE, vol. 95, pp. 215-233, Jan. 2007.

[36] B. S. Rao and H. F. Durant-Whyte, "Fully decentralized algorithm for multisensor

Kalman filtering", IEEE Proceedings-D, Vol. 138, No. 5, September 1991. [37] K. Romer, "Time synchronization in ad hoc networks", ACM Symposium on Mobile

Ad Hoc Networking and Computing (Mobihoc), Long Beach, CA, Oct. 2001. [38] S. I. Roumeliotis and G. A. Bekey, "Collective localization: A distributed Kalman

Filter approach to localization of groups of mobile robots", Proceedings of the IEEE International Conference on Robotics and Automation, San Francisco, CA, April 2000.

[39] S. Roy and R. A. Iltis, "Decentralized linear estimation in correlated measurement

noise", IEEE Transactions on Aerospace and Electronic Systems, July 1991. [40] A. H. Sayed and T. Kailath", A state-space approach to adaptive RLS filtering",

IEEE Signal Processing Magazine, July 1994. [41] R. Solis, V. Borkar, and P. R. Kumar, “A new distributed time synchronization protocol for multihop wireless networks”, Proceeding of the 45th IEEE Conference on Decision and Control, pp. 2734-2739, San Diego, Dec. 13-15 2006.

96

[42] D. P. Spanos and R. M. Murray, "Distributed sensor fusion using dynamic consensus", Proceedings of the 16th IFAC World Congress, July 2005.

[43] J. Stoer and R. Burlirsch, Introduction to numerical analysis, 3rd Ed. Springer Verlag, New York, 2002. [44] J. D. Wolfe and J. L. Speyer, "A low-power filtering scheme for distributed sensor

networks", Proceedings of the 42nd IEEE Conference on Decision and Control, Maui, Hawaii USA, Dec. 2003.

[45] L. Xiao and S. Boyd, "Fast linear iterations for distributed averaging", Systems and

Control Letters, vol. 53, pp. 65-78, 2004. [46] D. Yu, Y. Kim and S. Hwang, "A stable estimation model for time synchronization

on the Internet using Kalman filtering", Embedded and Ubiquitous Computing, pp. 931-941, July 2004.

[47] E. W. Zegura, K. Calvert, and S. Bhattacharjee, “How to model an internetwork”, Proceedings IEEE INFOCOM, pp.594-602, San Francisco, CA, 1996. [48] RFC1769 "Simple Network Time Protocol", RFC- Editor, www.rfc-editor.org.

97

Appendix A – Equivalence between KF and LS We intend to prove the following general theorem about the equivalence between the Kalman Filter and the Least-Squares problem. Assuming the Gaussian linear (time varying) state space model given by:

0( 1) ( ) ( ) ( ) ( ) (0) ( , )( ) ( ) ( ) ( ) ( ) (0, ( )) ; ( ) (0, ( ))

ox n n x n G n w n x N x Py n H n x n v n w n N Q n v n N R n

+ = Φ +⎧⎪⎨ = +⎪⎩

∼∼ ∼

Let us assume that the measurement noise ( )v n and the system noise ( )w n are white, uncorrelated and statistically independent of 0x . We denote the state vector that maximizes the Maximum A-Posteriori (MAP) probability

by ( )( ) ( )0 ,...,

Tn nMAP kx x x= . Let us recall that for the Gaussian case, this is equivalent to the

MMSE estimator, i.e., the state vector obtained by applying the Kalman Filter ( ( )ˆ |x n n ). Additionally, consider the following deterministic optimization problem:

{ } { }

( ) ( )1

1 10 0 0

0

,1

0

1

1 1(0) (0) ( ) ( )2 2

min1 ( ) ( )2

. . , 0,..., 1

k k

nT T

n k k kk

nx wT

k k k k k k kk

k k k k k

J x x P x x w Q w

y H x R y H x

s t x x G w k n

−− −

=

−

=

+

⎧ ⎫= − − + +⎪ ⎪⎪ ⎪⎨ ⎬⎪ ⎪+ − −⎪ ⎪⎩ ⎭

= Φ + = −

∑

∑ (A.1)

Here, 0x and { }0

n

k ky

= are given vectors, and 1 1

0 , ,k kP R Q− − are symmetric positive-definite

matrices. The previous constrained optimization problem is by definition a Least-Squares problem. Let us denote its optimal solution by LSx . Theorem The minimizing solution of the above LS problem is equal to the MAP estimator (and to the MMSE solution), i.e., under the above conditions:

LS MAP MMSEx x x= = (A.2)

98

Proof Let us write the expression of 0 0( ,..., , ,..., )n nP x x y y :

0 0 0 0 1( ,..., , ,..., ) ( , , , 0,..., 1, 0,..., )n n i i i i i j j j jP x x y y P x x x x G w y H x v i n j n+= = −Φ = − = = − =

Without loss of generality, we can assume that kG I= . Recalling that each term is Gaussian and that all the terms are independent, we can obtain:

( ) ( )

( ) ( )

10 0 0 0 0 0 0

11 1

0 0

1( ,..., , ,..., ) exp2

1 1exp exp ( ) ( )2 2

T

n n

n nT Ti i i i i i i j j j j j j j

i j

P x x y y const x x P x x

x x Q x x y H x R y H x

−

−− −

= =

⎛ ⎞= ⋅ − − − ⋅⎜ ⎟⎝ ⎠

⎛ ⎞ ⎛ ⎞⋅ − −Φ −Φ ⋅ − − −⎜ ⎟ ⎜ ⎟⎝ ⎠ ⎝ ⎠

∏ ∏

On the other hand, we can compute the MAP estimator:

( ) ( )( ){ } ( )( )

( ){ } ( ){ }{ }

0 00 0

0

0 0 0 0

,..., , ,...,arg max ,..., ,..., arg max

,...,

arg max ,..., , ,..., arg max log ,..., , ,...,

arg min

n nMAP n n

n

n n n n

n LS

P x x y yx P x x y y

P y y

P x x y y P x x y y

J x

⎧ ⎫⎪ ⎪= = =⎨ ⎬⎪ ⎪⎩ ⎭

= = =

= =

■

99

Appendix B - Maximum-Likelihood and Maximum A-Posteriori Estimators Consider the Weighted Least-Squares case, i.e., the objective function is given by:

( )2

1

,

1 ˆ( ) ( )

i

T T Tji i j

i j jij N

J y A x R y A x Or

τ τ−

∈

= − − = − +∑

As we have shown previously, the optimal decentralized solution for estimating the offset at each network node according to a single set of measurements is given by:

( )1 1 ˆ ( .1)1i

i

i ji jjij N

jij N

O Br

r

τ τ∈

∈

= ⋅ +∑∑

We will now compute the Maximum-Likelihood Estimator of the offsets and under some basic assumptions; we will obtain the same expression as in (B.1). As usual, the measurements model we used is:

( ) ( )1ˆ2ji ij ji i j ij

ijO

O T T τ τ ε= Δ −Δ = − +

Here, ijε is the estimation error of the relative offset between node iΛ and node jΛ (by

definition ij jiε ε= ) .We will further assume that these random variables are independent for any , 1,... ( )i j N i j= ≠ . Let us write the expression for the probability of the estimated offset ˆ

ijO given the values

of the vector τ under the assumption that the errors ijε are Gaussian random variables

with mean zero and variance ijr :

( )( )2ˆ

21ˆ |2

ij i j

ij

O

rij

ijP O e

r

τ τ

τπ

− − +

= (B.2)

The joint probability of the estimated offsets ˆ

ijO given the values of the vector τ

( ,i j∀ such that ij N∈ ) is given by (assuming that ijε are independent):

( )2ˆ

2

,

12

ij i j

ij

i

O

r

iji j NP e

r

τ τ

π

− − +

∈= ∏

100

Taking the logarithm function on P , differentiating it with respect to iτ and setting it to zero will lead to the same solution as in the Weighted-Least-Squares case. Under the assumptions that the estimation errors are both Gaussian (with zero mean) and independent random variables, our previous optimal estimated offsets (for the WLS case) also correspond to the maximally likely set of offset assignments. Moreover, it was proved in [21] that the Maximum-Likelihood estimated offsets are similar to the minimum-variance unbiased estimated offsets. The next step is to compute the Maximum A-Posteriori estimator for the general problem where the objective function is composed of two distinct terms:


We notice that this case is a Bayesian case and thus the Maximum-Likelihood estimator is irrelevant, but we may compute the MAP estimator instead. We consider the general case, where the matrix 0P is not assumed to be diagonal. The joint

probability of the estimated vector τ given the values of the estimated offsets ˆijO

( ,i j∀ such that ij N∈ ) is given by (assuming that ijε are independent):

( ) ( ) ( )

( )

( )( )

( )

( )( ) ( ){ }

2ˆ

2

,

0 0

1 10 0 02

ˆ |ˆ|

ˆ

1ˆ |2

,

ijij

ij

Oij i jrji

iji j N jii

Tx P x

Bayes

P O PP P O

P O

P O er

N x P

P e

τ τ

τ τ

τ ττ

τπ

τ

τ

− − +

∈

−− − −

↑

⋅= =

=

∝

∏

∼

This implies:

( )( )

( ) ( ) ( ) ( )1 10 0

110

1 1 ˆ (0) (0) ( .3)1

N

i ji j i k kMAP ii ikj N kjii

k iii

j N jii

O P P Br

Pr

τ τ τ τ τ− −

∈ =−≠

∈

⎡ ⎤⎢ ⎥= + + − −⎢ ⎥⎛ ⎞⎢ ⎥+ ⎣ ⎦⎜ ⎟⎜ ⎟

⎝ ⎠

∑ ∑∑

In this case, we assumed that each offset is a Gaussian random variable that is independent of the other offsets. Moreover, we used the fact that ( )ˆ

ijP O is independent of τ (it comes

directly from the measurements: ( )1ˆ2ij ij jiO T T= Δ −Δ ) and then, does not affect the

differentiation. Under these assumptions, we obtain that the Maximum A-Posteriori estimator in (B.3) is the same as the optimal estimator computed previously.

101

Appendix C – CTP Numerical Results We compare the CTP algorithm to three different hierarchical versions of NTP and then analyze the convergence rate of the decentralized CTP algorithm. First, we consider Network 1 (400 nodes with relatively high connectivity of 1798 edges). Figure C.1 shows the fraction of nodes with clock offset with respect to the reference time node that is not grater than t for the 4 different algorithms. In other words, the y-axis represents the fraction of nodes with clock offset, with respect to the UTC, not greater than the value described by the x-axis. Figure C.1 clearly demonstrates the significant improvement of CTP (Least-Squares approach) over all the hierarchical NTP schemes considered here, in terms of clock accuracy.

0 5 10 15 20 25 300

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1


Frac

tion

of n

odes

400 Nodes

CTP

NTP-1

NTP-2

NTP-3

Figure C.1. Distribution of the clock offsets for each algorithm in Network 1.

102

Figure C.2 presents the results for the same network topology, but in a different way. The y axis shows the Probability Density Function (PDF) with clock offsets described by the x-axis.

-25 -20 -15 -10 -5 0 5 10 15 20 250

0.1

0.2

0.3

0.4

0.5

0.6

0.7

Clock offset

Frac

tion

of n

odes

400 Nodes

CTPNTP-1NTP-2NTP-3

Figure C.2. Probability Density Function of the clock offsets for each algorithm in Network 1.

The next figure depicts the clock offset dispersion around the UTC clock in the same 400 node network. The x-axis corresponds to the node ID, whereas the y-axis corresponds to the clock offset with respect to the UTC clock after performing each one of the different algorithms. It can be seen that as expected, the CTP algorithm keeps all the estimated offsets in a very narrow region which means small errors and small dispersions in the clocks. The other schemes are characterized by a much wider domain. Furthermore, the thickness of the region in CTP is the same for all the nodes over the network, whereas in the NTP schemes it slightly depends on the ID. This can be explained by the fact that the NTP schemes are hierarchical and, as a consequence, the closest the nodes are from the reference (small ID), the most accurate are the offsets.

103

-10

0

10

CTP

-10

0

10

NTP

-1

-10

0

10

NTP

-2

-10

0

10

NTP

-3

Node ID

400 Nodes

Figure C.3. The clock offsets dispersion in Network 1.

The next part of this section is dedicated to the convergence analysis of the distributed CTP algorithm. From now, we consider Network 2 (400 nodes with 997 edges).

We examined the clock offsets after 0, 1, 3, 5 and 10 iterations with respect to the optimal centralized solution. Figure C.4 describes the fraction of nodes with clock offset relating to the set of optimal values not greater than t in Network 2. We start with clock offsets that are uniformly distributed (0 iteration). It can be seen in the graph that before we start, we have very few nodes (less than 10%) that are synchronized, but 32%, 55%, 77% and 84% of the nodes are within one time unit of the optimal solution after the first, third, fifth and tenth iteration respectively.

104

0 1 2 3 4 5 6 70

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1


Frac

tion

of N

odes

Distributed CTP 400 Nodes

0 iterations

1 iteration

3 iterations

5 iterations

10 iterations

Figure C.4. Rate convergence analysis of the decentralized CTP algorithm in Network 2.

I

תקציר

פתרונות יעילים לבעיית סנכרון שעונים ברשת עיצ באלגוריתמי שערוך על מנת להעבודה זו דנהרמת . מדויק מהווה דרישה בסיסית עבור רוב מערכות המחשבים המבוזרותון שעוניםסנכר .מחשבים

ק הסנכרון בין הביצועים של יישומים רבים המבוססים על תאום בין ישויות שונות תלויה במידה רבה בדיו. ים אלחוטייםחיישנ בעיית עקיבה בעזרת רשת דוגמא לכך היא. השעונים השונים הנמצאים במערכת

המרוחקים האחד , ששעונים ברשת התקשורתהמטרה של פרוטוקול שמסנכרן שעונים היא להבטיח חשבים הינה מהדרך המקובלת לסנכרן שעונים ברשת . בצורה אחידה ככל הניתןשעהימדדו את ה, מהשני

בין הישויות המבוזרות על מנת לתאם בין )Probe Packets( באמצעות החלפת הודעות סטנדרטיות , טופולוגית קבועה בזמןה ש, סימטריים ברשתנניח לשם פשטות שהקשרים, בעבודה זו. הזמנים שלהם

בספרות קיימות גישות רבות .ושכל צומת ברשת מסוגל לשלוח ולקבל הודעות מהצמתים השכניםהסטנדרט המקובל היום לסנכרון . פתרונות יעילים לבעיית סנכרון שעונים ברשתות מסורתיותהמציעות

.NTP (Network Time Protocol) [Mills, 1991, 92 and 1995]- הוא ה, למשלשעונים באינטרנט

CTP (Classless Time Protocol) [Gurewitz, Cidon- האלגוריתם : גישה חדשהההוצע, לאחרונהand Sidi, 2003 .[ היררכית ומפחית את שגיאות השעונים מבלי להגדיל את הפרוטוקול פועל בצורה לא

תורת האופטימיזציה הקמורה על מנת להעריך את ההשפעה את גישה זו מנצלת.NTP- ביחס להסבוכיות מבוססת על משערך גישה נוספת. על פונקצית המטרה הכללית) Clock Offsets(של כל התזוזות בזמן דיוק. Least-Squares Estimator( [Solis, Borkar and Kumar, 2006](הריבועים הפחותים

העובדה דהיינו( אילוצים הקשורים לטופולוגית הרשת התחשבותי " עמושג של סנכרון השעונים משופרסינכרוני הדורש -א ושימוש באלגוריתם מבוזר) מתאפס סגוריםסכום תזוזות השעונים לאורך מעגליםש

ל הינו המבנה המבוזר הדורש תקשורת מקומית " המאפיין המרכזי של השיטות הנ.תקשורת מקומית בלבד והאלגוריתם שמבוסס על שיטת הריבועים CTP-ניתן להראות שאלגוריתם ה .עם השכנים בלבד

.ים לחלוטיןשקולהפחותים

מסנן קלמן הינו משערך המצב , הנחה הגאוסית תחת התמערכת דינמית ליניאריעבור , בתורת השערוךהפתרון יוביל , ללא ההנחה הגאוסיתMMSE). (ת במובן שגיאה ריבועית ממוצעת מינימאלייהאופטימאל

מבוזרת נחקרה היישום של מסנן קלמן בצורMMSE. האופטימאלי במובן י הליניארלמשערך מצב לפתח אלגוריתמי שערוך יעילים על מנת לסנכרן מטרתנו הינה. 3 שנראה בפרק פיב בספרות כרחבאופן נ

ללא. שיפור הביצועים בהשוואה לאלגוריתמים הקיימיםךאת כל השעונים ברשת ביחס לשעון הייחוס תו ולכן מספיק לסנכרן את י מסונכרן עם השעון האוניברסאל1נוכל להניח שצומת מספר , הגבלת הכלליות

כלומר אין , זההקצבאנו נניח שכל השעונים רצים ב, ראשון בשלב .כל יתר הצמתים ברשת ביחס אליו של כל השעונים ברשת ביחס הזמן ואז המטרה שלנו היא לשערך את תזוזות)Clock Skew (קצב הבדלי

.לשעון הייחוס

אנו נרחיב את גישת הריבועים הפחותים בעזרת פיתוח אלגוריתמי שערוך של תזוזת השעון , במחקר זההשלב הראשון יהיה לייצג את המערכת במודל מישור . רשת המבוססים על סינון קלמןשל כל צומת ב

אנו נראה שניתן לבצע ,בנוסף. שעון של כל צומת ברשתתזוזותי " וקטור המצב נתון עמצב כאשרההמסגרת .המתכנסת למשערך האופטימלי המרוכז, עדכון מדידה בודדת בעזרת סכימה מבוזרת איטרטיבית

ולהעניק משקלים שונים למדידות ביחס ,למן מאפשרת לנצל את הידע האפריורי על המערכתשל מסנן ק נשים לב כי ,ולם א.הקווריאנס ההתחלתית הינה אלכסוניתנניח באופן טבעי שמטריצת . ן ולדיוקןלאיכות

מטריצת הקווריאנס מאבדת את המבנה , של המסנן קלמן הראשונהאחרי השלב של עדכון המדידהלא יהיו מבוזרות ואז כל צומת החל משלב זה המשוואות של המסנן קלמן, חרותבמילים א. שלהוניהאלכסעבור רשתות גדולות זמן התקשורת כי , כמובן מצב לא רצויוזה. לתקשר עם כל צומת אחרחמוכרברשת

יה של המבוסס על מניפולצאנו נפתור בעיה זו בעזרת אלגוריתם רקורסיבי מבוזר. יהיה בלתי נסבלנסתמך על משפט שטוען שהפתרון של המסנן קלמן שקול לפתרון של .המשוואות הסטנדרטיות

)LS (קבל את המקרה הקיים בספרותנוכל למינימיזציה מאולצת על פונקצית מטרה דטרמיניסטית ואז המטרה לפי כל יש לגזור את פונקצית , על מנת למצוא את הפתרון האופטימאלי. כמקרה פרטי

כדי לממש .)נשווה את הגרדיאנט לאפס, יבמקרה המרכז (קואורדינאטה במקרה המבוזר ולהשוות לאפס

II

סינכרוני ונראה התכנסותו איטרטיבי אלגוריתםב נבחראנו ,את המשוואה האופטימאלית המתקבלתשל אלגוריתם משמעותה. אי שליליותה בעזרת כלים מתורת המטריצות לפתרון האופטימאלי המרוכז

, עד כה. ן שלו באמצעות תקשורת עם השכנים בלבדשעומבוזר הינה שכל צומת ברשת מחשב את תזוזת השבו קיימות מספר למקרה את הטיפולרחיבנשלב הבא ב. יחסנו למקרה שרק מדידה אחת זמינההתי

בן על המבנה שומר כמוה תםנציג גרסה רקורסיבית של האלגורי ו,קבוצות של מדידות בזמנים שוניםוון שהוא מבוזר ומתכנס לפתרון י של העבודה מכת המרכזיהסכימהזה מהווה את אלגוריתם . המבוזר שלו

תחשוב לציין שהפתרון זהה לפתרון של המסנן קלמן בצור. MMSEהאופטימאלי המרוכז במובן וזות הזמן בנוסף לחישוב של תז. אך המבנים של שתי השיטות שונים בצורה מהותית,אינפורמציה

נותן לנו ה דבר ,ת השערוךוהאלגוריתם מניב את השונויות של שגיא, המשוערכות לכל צומת ברשתאופטימאלי הפשוט שמזניח את -אנו נתייחס לאלגוריתם התת, בנוסף.אינדיקציה על טיב השערוך

ניכרת את ה בצורה ורידשיטה זו מ. האיברים מחוץ לאלכסון של המטריצה ההפוכה למטריצת הקווריאנסתוצאות הסימולציה באנו נראה בפרק שדן . הסיבוכיות אך מאבדת את תכונת האופטימאליות

. ולא מהווה פתרון יעיל לבעיהחלשיםשהאלגוריתם האחרון משיג ביצועים

לפונקציית המחיר הוספה של מקדם היווןהכוללות, נדון במספר הרחבות של האלגוריתם הבסיסי נתאים את האלגוריתם כך שיהיה עמיד ,כמו כן. עש מערכת במשוואת הדינאמיקהשל רהוספה והריבועית

ביחד עםזוזה בזמן התשלשערוך בעיית הלנתייחס בקצרה ,בנוסף. בפני שגיאות זמניות בתקשורתרת המתמטית תחת לאותה המסגמצטמצמותשתי הבעיות נראה ש ואנ(Clock Skew). קצבהתזוזה ב

שערוך הקצב , בגישה אחת .ות שונותגישנציע , קצב בעיית שערוך תזוזת העבור. ההצבות המתאימותנציע שערוך אופטימלי משולב של תזוזות הקצב והזמן , בגישה השנייה. מתבצע בנפרד לשערוך התזוזה

.בעזרת מסנן קלמן

ת טופולוגיות רשת שונות על מנת להעריך ולהשוות אבורנציג מספר תוצאות סימולציה ע, לאחר מכןועים מבחינת דיוק משפרת את הביצ מסנן קלמןה שלגישהנקבל כי , כצפוי. השונותמותיהסכהדיוק של

מסנן קלמן מניב תוצאות משופרות בהשוואה האלגוריתם שמבוסס על , לים אחרותבמי. סנכרון השעונים השוואות נבצע.CTP- ואלגוריתם הNTP גרסאות שונות של :כגון, לשיטות האחרות הקיימות בספרות

הכללה של ידע התחלתי ומימוש של האלגוריתם , הוספת מטריצת משקול למדידות: הכוללותשונות אנו נסיק שהשיטה המוצעת הינה הרחבה של האלגוריתם . מדידות שלקבוצות הרקורסיבי עבור מספר

מולציה נציין כי תוצאות הסי. ותחת התנאים המתאימים נותנת את הביצועים הטובים ביותר,הקיים . שכל השעונים ברשת רצים במהירות זההמתבצעות תחת ההנחה

שערוך מבוזר כללית שלבעיית סנכרון שעונים ברשת הינה שקולה מבחינה מתמטית לבעיה כינצייןניתן ליישם את אותם האלגוריתמים , לדוגמה. אדיטיביות ברשתות סנסוריםזרת מדידות יחסיותבע

בעזרת הרחבה וקטורית , במישור או במרחב)Sensor Localization Problem( סנסוריםכוןלבעיית אי .7 אנו נפרט על כך בקצרה בפרק .פשוטה

נבחן את הרקע המדעי הנחוץ ונביא סקירה מקיפה של , 3- ו2 יםבפרק. מבנה העבודה הזו הוא כדלקמן

את האלגוריתמים השונים נציג , 5 בפרק . הבעיה אתנסחנתאר את המודל ונ, 4בפרק . הספרות בהתאמהמוקדש להוכחת 6פרק ). הן בגרסה המרוכזת והן בגרסה המבוזרת (בודדתמקרה של מדידה עבור ה

אנו נספק את הגרסה , 7בפרק .לפתרון האופטימאלי המרוכז המבוזרהתכנסות של האלגוריתם ה דנים 9-ו 8 פרקים . ואלגוריתם נוסף לא רקורסיבי עבור מדידות מרובותרסיבית של האלגוריתםהרקו

תוצאות . קצב הוספת מקדם היוון והשערוך של תזוזת הגוןכ, מקרה הבסיסישל הבמספר הרחבות .11 בפרק צוינות מחקר עתידי מאודות הערות מספרמסקנות ו, בסוףל. 10סימולציה מוצגות בפרק

א

תוכן עניינים

1 אנגלית ב תקציר

5 ואבמ1.

7 מדעי רקע . 2

7.........................................................................................שיטת הריבועים הפחותים 2.1

10........................................................................................................... סינון קלמן2.2

12.................................................................................סנן קלמן בצורת אינפורמציה מ2.3

12.....................................................................................רפיםעובדות מתוך תורת הג 2.4

14................................................................................................... אנליזה מטריצית 2.5

15 סקר ספרות. 3

15................................................................................... לסינכרון שעוניםים פרוטוקול3.1 18........................................................................................................שערוך מבוזר 3.2 20...............................................................................................אלגוריתמי קונצנזוס 3.3

21 הגדרת המודל וניסוח הבעיה. 4

21..........................................................................................................שעון מודל ה4.1

21.............................................................................................................. המדידות4.2

22.........................................................................................................ניסוח הבעיה 4.3

22.................................................................................………………… מודל מצב4.4

24 עדכון מדידה בודדת :וזרבאלגוריתם מרוכז ואלגוריתם מ. 5

24................................................................................................ משוואות מסנן קלמן5.1 25..........................................................................................גישת הריבועים הפחותים 5.2 26......................................................... האופטימלי הפתרון משוואות:תם המבוזר האלגורי5.3

30...............................................................המשוואות האיטרטיביות: האלגוריתם המבוזר5.4

33 ניתוח התכנסות. 6

39 מרובות מדידותעדכון: קורסיביאלגוריתם ר. 7

39....................................................................................... מרוכזמסנן קלמן אלגוריתם 7.1

40.................................................................................. האופטימלי המבוזרהאלגוריתם 7.2

ב

48................................................................................ מסנן קלמן שקילות עם משוואות7.3

49................................................................................ מבוזרטימאליאופ- אלגוריתם תת7.4

52................................................................................... לא רקורסיבימבוזר אלגוריתם 7.5

54 הרחבות . 8

54................................................................................................. הוספת מקדם היוון8.1

57.......................................................................................... הוספת רעש מערכת לבן8.2

60............................................................................. סביבה הכוללת שגיאות בתקשורת8.3

61 שעון ה קצב שערוך . 9

62...........................................................................קצב של תזוזות זמן ומשולב שערוך 9.1

65..............................................................................קצבשל תזוזות זמן ו נפרד שערוך 9.2

73תוצאות סימולציה . 10

73............................................................................................ תיאור האלגוריתמים10.1

74................................................................................ הנבדקותתתו טופולוגיית הרש10.2

76................................................................................................... תוצאות גרפיות10.3

91 מסקנות ועבודה עתידית . 11

93 רשימת מקורות

97 שקילות בין מסנן קלמן ושיטת הריבועים הפחותים -נספח א

MAP 99-ית ובירות מירב משערכי ס- בנספח

CTP 101 תוצאות בימולציה עבור -נספח ג

I תקציר

נחום שימקין' המחקר נעשה בהנחיית פרופ בפקולטה להנדסת חשמל

הכרת תודה

. על התמיכה ועל האווירה המצויינת בכל שלבי המחקר, נחום שימקין על הנחייתו' ברצוני להודות לפרופ יתודה מיוחדת להורי. הזאתזהי שלהם על התרבות הערךתובנות העבור ברצוני להודות לבוחנים , בנוסף

.עצום על מצוינות אקדמיתהדגש בפרט על ה, ליהעניקועל האהבה שלהם ועל החינוך ש

אני מודה לטכניון על התמיכה הכספית הנדיבה בהשתלמותי

סנכרון שעונים ברשת בעזרת מסנן קלמן מבוזר

חיבור על מחקר

דרישות לקבלת התואר מגיסטרלשם מילוי חלקי של ה למדעים

מקסים כהן

. מכון טכנולוגי לישראל–הוגש לסנט של הטכניון

2009 קטוברוא חיפה ט"תשס חשוון

סנכרון שעונים ברשת בעזרת מסנן קלמן מבוזר

מקסים כהן

network time synchronization using decentralized kalman...

Documents