a combined stochastic and greedy hybrid estimation capability...

48
A Combined Stochastic and Greedy Hybrid Estimation Capability for Concurrent Hybrid Models with Autonomous Mode Transitions Lars Blackmore a,, Stanislav Funiak b and Brian C. Williams a a Massachusetts Institute of Technology, 77 Mass. Ave, Cambridge, MA 02139 b Carnegie Mellon University, 5000 Forbes Ave, Pittsburgh, PA 15213 Abstract Probabilistic hybrid discrete/continuous models, such as Concurrent Probabilistic Hybrid Automata (CPHA) are convenient tools for modeling complex robotic sys- tems. In this paper, we present a novel method for estimating the hybrid state of CPHA that achieves robustness by balancing greedy and stochastic search. To ac- complish this, we 1) develop an efficient stochastic sampling approach for CPHA based on Rao-Blackwellised Particle Filtering, 2) perform an empirical comparison of the greedy and stochastic approaches to hybrid estimation and 3) propose a strat- egy for mixing stochastic and greedy search. The resulting method handles nonlinear dynamics, concurrently operating components and autonomous mode transitions. We demonstrate the robustness of the mixed method empirically. Key words: Fault detection, hybrid systems, robust sensing, graceful system degradation 1 Introduction Robotic and embedded systems have become increasingly pervasive in a vari- ety of applications. Space missions, such as Mars Science Laboratory [1] have increasingly ambitious science goals, such as operating for longer periods of time and with increasing levels of onboard autonomy. Here on Earth, robotic assistants, such as CMU’s Pearl [2] directly benefit people in ways ranging from providing health care to performing rescue operations. Corresponding Author Email addresses: [email protected] (Lars Blackmore), [email protected] (Stanislav Funiak), [email protected] (Brian C. Williams). Preprint submitted to Elsevier 2 August 2006

Upload: others

Post on 29-Sep-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

A Combined Stochastic and Greedy Hybrid

Estimation Capability for Concurrent Hybrid

Models with Autonomous Mode Transitions

Lars Blackmore a,∗, Stanislav Funiak b and Brian C. Williams a

aMassachusetts Institute of Technology, 77 Mass. Ave, Cambridge, MA 02139bCarnegie Mellon University, 5000 Forbes Ave, Pittsburgh, PA 15213

Abstract

Probabilistic hybrid discrete/continuous models, such as Concurrent ProbabilisticHybrid Automata (CPHA) are convenient tools for modeling complex robotic sys-tems. In this paper, we present a novel method for estimating the hybrid state ofCPHA that achieves robustness by balancing greedy and stochastic search. To ac-complish this, we 1) develop an efficient stochastic sampling approach for CPHAbased on Rao-Blackwellised Particle Filtering, 2) perform an empirical comparisonof the greedy and stochastic approaches to hybrid estimation and 3) propose a strat-egy for mixing stochastic and greedy search. The resulting method handles nonlineardynamics, concurrently operating components and autonomous mode transitions.We demonstrate the robustness of the mixed method empirically.

Key words: Fault detection, hybrid systems, robust sensing, graceful systemdegradation

1 Introduction

Robotic and embedded systems have become increasingly pervasive in a vari-ety of applications. Space missions, such as Mars Science Laboratory [1] haveincreasingly ambitious science goals, such as operating for longer periods oftime and with increasing levels of onboard autonomy. Here on Earth, roboticassistants, such as CMU’s Pearl [2] directly benefit people in ways rangingfrom providing health care to performing rescue operations.

∗ Corresponding AuthorEmail addresses: [email protected] (Lars Blackmore), [email protected]

(Stanislav Funiak), [email protected] (Brian C. Williams).

Preprint submitted to Elsevier 2 August 2006

Page 2: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

In order to act robustly in the physical world, robotic systems must handle theuncertainty and partial observability inherent in most real-world situations.Robotic systems often face unpredictable, harsh physical environments andmust continue performing their tasks (perhaps at a reduced rate), even whensome of their subsystems fail. For example, in land rover missions, such asMSL, the robot needs to detect when one or more of its wheel motors fail,which could jeopardize the safety of the mission. The rover can detect thefailure from a drift in its trajectory and then compensate for the failure, eitherby adjusting the torque to its other wheels or by replanning its path to thedesired goal.

In our previous work we have developed methods for estimating the state ofsystems that evolve in a discrete manner, where system behavior is modeled byConcurrent Probabilistic Constraint Automata (CPCA), one automaton percomponent[3][4]. In many situations, however, a purely discrete model is insuf-ficient. Probabilistic hybrid models therefore represent the system with bothdiscrete and continuous state variables that evolve probabilistically accord-ing to a known distribution. The discrete state variables typically representa behavioral mode of the system, while the continuous variables represent itscontinuous dynamics. Probabilistic hybrid models can be used to provide anappropriate level of modeling abstraction when purely discrete, qualitativemodels are too coarse, while purely continuous, quantitative models are toofine-grained.

In this paper, we investigate the problem of estimating the state of systemswith probabilistic hybrid models. Given a sequence of control inputs and noisyobservations, our goal is to estimate the discrete and continuous state of thehybrid system. Probabilistic hybrid models are particularly useful for faultdiagnosis, the problem of determining the health state of a system. Withhybrid models, fault diagnosis can be framed as a state estimation problem,by representing the nominal and fault modes with discrete variables and thestate of the system dynamics with continuous variables.

Our previous work [5] developed the HME (Hybrid Mode Estimation) sys-tem, which extended our work on discrete model-based diagnosis for Con-current Probabilistic Constraint Automata, to reason about probabilistic hy-brid models known as Concurrent Probabilistic Hybrid Automata [5]. As withCPCA, CPHA represent the system as a collection of concurrently operatingautomata, one automaton for each component in the system.

In this paper we present a novel method for hybrid estimation with CPHA thatachieves robustness by balancing greedy and stochastic search. The key insightbehind the new algorithm is that, in many AI and optimization methods,a combination of stochastic and greedy search methods can be effective inpractice. This is analogous to the ‘exploration vs. exploitation’ tradeoff, which

2

Page 3: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

has been used with great success in Constraint Satisfaction Problems (CSP)[6] and reinforcement learning [7], for example. Previous methods for hybridestimation have used either greedy search [8] or stochastic sampling [9]. In thispaper we show empirically that these methods can have limited performancedepending on whether the belief state is concentrated in a few mode sequences,or is relatively flat. The mixed greedy/stochastic method, by contrast, is robustto changes in the variance of the posterior distribution.

Developing this method requires three main technical contributions. First,we introduce an efficient stochastic sampling approach for CPHA based onRao-Blackwellised Particle Filtering. Second, we perform an empirical studyof the greedy and stochastic CPHA methods on a simulated acrobatic robotexample. Third, based on the comparative insights gained we propose a mixedexploration/exploitation strategy, and demonstrate its superiority over theseparate approaches.

The paper is organized as follows. In Section 2 we consider existing methods forestimation with hybrid models. In the general case, inference in hybrid mod-els is NP-Hard[10]. Approximate inference using techniques in k-best Gaussianfiltering [8,11] have been effective for hybrid state estimation. These methodsrepresent the system state as a mixture of Gaussians that are enumeratedin decreasing order of likelihood. Diagnostic errors can occur, however, if thecorrect diagnosis is not among the leading set of hypotheses. In this paper,we demonstrate this empirically. An alternative approach is to use stochas-tic sampling to allow the system to perform greater exploration, rather thanperforming a purely greedy search. An example of such a method is ParticleFiltering, which approximates the posterior state distribution using a finitenumber of particles[12][13]. Examples of diagnosis systems include [9] and[14]. These particles are evolved stochastically and resampled based on anappropriately defined importance weighting. The inefficiency of sampling inhigh-dimensional spaces has limited the effectiveness of Particle Filtering forstate estimation in hybrid systems[15]. To avoid these problems, we thereforepropose an efficient stochastic sampling approach for CPHA based on Rao-Blackwellised Particle Filtering. Rao-Blackwellised Particle Filtering exploitsa tractable substructure in the underlying model by sampling only a subspaceof the system state, while estimating the remainder using an efficient analyticalmethod.

Rao-Blackwellised Particle Filtering has been used for hybrid state estimationin Switching Linear Dynamic Systems[13,16], in which the discrete state d is aMarkov chain with a known transition probability p(dt|dt−1), and the contin-uous state evolves linearly, with system and observation matrices dependenton dt. In many domains, however, simple Markovian transitions p(dt|dt−1) arenot sufficiently expressive[5,17]. In these domains, the transitions of the dis-crete variables depend on the continuous state. Such transitions are called au-

3

Page 4: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

tonomous. Furthermore, many systems consist of several interconnected com-ponents, each of which is in its own behavioral mode. Representing the jointmode of all the components would be inefficient. In these cases, it is desirableto represent the mode with several mode variables. Finally, many systemshave nonlinear dynamics. Each of these properties - autonomous transitions,interconnected components, and nonlinear dynamics, are expressed naturallyin CPHA.

In Section 3, we extend Rao-Blackwellised Particle Filtering to handle au-tonomous mode transitions, nonlinearities and concurrency. Applying Rao-Blackwellisation schemes to models with autonomous transitions is difficult,since the discrete and continuous state spaces of these models are closely cou-pled. The key innovation in our algorithm is that it reuses the continuous stateestimates in the importance sampling step of the particle filter. We extend theclass of autonomous transitions that can be handled over our previous workin [5] and [18] to multivariable linear transition guards, in the case of piece-wise constant transition distributions, and to piecewise polynomial transitiondistributions of arbitrary order, in the case of single variables. Our RBPF al-gorithm handles nonlinear dynamics by using an Extended Kalman Filter [19]or an Unscented Kalman Filter [20]. We presented this innovation in [18], thiswas also proposed independently by [21].

These developments provide the foundation for a unified treatment of k-bestenumeration and RBPF approaches to hybrid state estimation. Both theseapproaches represent the belief state by a mixture of Gaussians for a subset ofmode trajectories traced by the discrete state. The former approach enumer-ates the trajectories in best-first order, while the latter evolves them throughsampling. While prior work [21] has compared the performance of RBPF toother particle filters, there has been little empirical comparison of RBPF andk-best methods. Such an analysis is crucial to understanding the trade-offsbetween the two methods, and to developing a new approach that combinesthe strengths of both. In Section 6 we carry out the comparison and show thatboth approaches have limitations, depending on whether the posterior distri-bution is concentrated in a few discrete mode trajectories, or is relatively flatacross many different trajectories. These results demonstrate the need for anew algorithm that is robust to changes in the variance of the approximatedposterior distribution.

In Section 5 we develop such an algorithm. The new algorithm uses Rao-Blackwellised particle filtering to generate, stochastically, additional candi-dates to add to k-best enumeration that would not have been tracked by apurely greedy approach. The algorithm maintains a set of particles, which areupdated using RBPF, and a set of mode trajectories with the highest posteriorprobability, generated by both RBPF and k-best successor enumeration. Thisalgorithm makes use of the efficient properties of both k-best enumeration and

4

Page 5: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

RBPF, while being probabilistically sound.

Using a simulated acrobatic robot, we demonstrate that the mixed algorithmis effective for both a concentrated and flat posterior distribution. The mixedalgorithm shows a dramatic increase in robustness for a small performancepenalty.

2 Models and State Estimation Methods

Probabilistic hybrid models and hybrid estimation methods date back to the1970s [22] and are useful in many applications, including visual tracking [23]and fault diagnosis [5,24]. In this section, we review a formalism for model-ing probabilistic hybrid systems, known as Concurrent Probabilistic HybridAutomata. We then define the hybrid state estimation problem and outlinethe existing approach to hybrid state estimation for CPHA, based on greedyenumeration. This lays the groundwork for the new approach, based on Rao-Blackwellised Particle Filtering, which is described in Section 3.

2.1 Concurrent Probabilistic Hybrid Automata

We have previously developed Concurrent Probabilistic Hybrid Automata(CPHA) [8], a formalism for modeling engineered systems that consist of alarge number of concurrently operating components with non-linear dynamics.A CPHA model consists of a network of concurrently operating ProbabilisticHybrid Automata (PHA), connected through shared continuous input/outputvariables. Each PHA represents one component in the system and has both dis-crete and continuous hidden state variables. The automaton interacts with theother automata in the surrounding world through shared continuous variables,and its discrete state determines the evolution of its continuous variables.

Definition 1 A Probabilistic Hybrid Automaton is a tuple 〈x,w, F, T,X0,Xd,Ud〉[5]:

• x denotes the hybrid state of the automaton. x � xd ∪ xc where xd denotesthe discrete state variables xd ∈ Xd and xc denotes the continuous statevariables xc ∈ R

nx . 1

• w denotes the set of input/output variables, which consists of command(discrete input) variables ud ∈ Ud, continuous input/output variables wc ∈R

nw , and Gaussian noise variables vc ∈ Rnv .

1 We let lowercase bold symbols, such as v, denote both the set of variables{v1, . . . , vl} and the vector [v1, . . . , vl]T .

5

Page 6: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

T

ball:

actuator: okfailed

noyes

θ1

θ2

Fig. 1. A two-link acrobatic robot. The robot swings on a bar and may catch aball of known mass, whenever it is on the right side (θ1 > 0.55). The body of therobot can be modeled as a PHA, where the hidden continuous state consists of fourvariables, representing the angles and angular velocities at two joints.

• F : Xd → FDE∪FAE specifies the continuous evolution of the automaton foreach discrete mode, in terms of the set of discrete-time difference equationsFDE and the set of algebraic equations FAE over the variables xc, wc, andvc.

• T : Xd → 2P ∪ C specifies the discrete transition distribution of the automa-ton as a finite set of transition probabilities pτi ∈ P over the target modesXd and their associated guard conditions ci ∈ C over xc ∪ ud. The guardconditions ci form a partition of the space R

nx × Ud.• X0 is a distribution for the initial state of the automaton, with a Gaussian

distribution p(xc,0|xd,0) for the continuous state xc,0, conditioned on eachdiscrete mode xd,0.

The transition function T (d), for some mode d, specifies the transition dis-tribution p(xd,t|xd,t−1 = d,xc,t−1,ud,t). Each tuple 〈pτ , c〉 ∈ T (d) defines thetransition distribution as having a constant value of pτ in the regions sat-isfied by the guard c. The transition function can thus specify conditionaldistributions p(xd,t|xd,t−1 = d,xc,t−1,ud,t) that are piecewise constant in thecontinuous state xc,t−1. While in keeping with previous work we retain thisdefinition for PHA, in Section 4.2 we show that our efficient hybrid estimationmethod can handle piecewise polynomial transition functions.

Most engineered systems consist of several concurrently operating compo-nents. Composition of PHA provides a method for specifying a model for theoverall system, by specifying PHA models for its components and then com-bining these models. Composed automata are connected through shared con-tinuous input/output variables, which corresponds to connecting the system’sphysical components through natural phenomena, such as forces, pressures,and flows.

6

Page 7: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

ball

yes

no

55.01 ≤θ

1.0

55.01 >θ0.99

55.01 >θ 0.99

55.01 ≤θ1.0

0.01

0.01

2121 ,,, ωωθθContinuous dynamics:

2,2,1,2,1,21,2

1,2,1,2,1,11,1

2,2,21,2

1,1,11,1

),,,,(

),,,,(

ω

ω

θ

θ

δωωθθωδωωθθω

δωθθδωθθ

vtTf

vtTf

vt

vt

ttttyest

ttttyest

ttt

ttt

+=

+=++=

++=

+

+

+

+

2,2,1,2,1,21,2

1,2,1,2,1,11,1

2,2,21,2

1,1,11,1

),,,,(

),,,,(

ω

ω

θ

θ

δωωθθωδωωθθω

δωθθδωθθ

vtTf

vtTf

vt

vt

ttttnot

ttttnot

ttt

ttt

+=+=

++=++=

+

+

+

+

Fig. 2. A PHA for the two-link body of the acrobot system. Left: Transition modelfor the discrete state ball of the body, representing whether or not the robot carriesa ball of mass mball. When θ1 > 0.55, there is a probability at each time step of0.01 that the state of ball changes. When θ1 > 0.55, the state remains the same.Right: Evolution of the automaton’s continuous state, one set of equations for eachmode. The continuous dynamics for each mode can be derived using Lagrangianmechanics and turned into a set of discrete-time difference equations, using theEuler approximation.

In order to compose PHA, we combine their hidden state variables and theirdiscrete and continuous evolution functions:

Definition 2 The composition CA of two Probabilistic Hybrid Automata A1

and A2 is defined as a tuple 〈x,w, F, T,X0,Xd,Ud〉, where

x = xd ∪ xc, with xd � xd 1 ∪ xd 2 and xc � xc 1 ∪ xc 2,

w � w1 ∪ w2,

F (xd) � F1(xd 1) ∪ F2(xd 2),

T (xd) � T1(xd 1) × T2(xd 2),

X0(x) = X0 1(x2)X0 2(x2),

Xd � Xd1 ×Xd2, and

Ud � Ud1 × Ud2.

We define the transitions of each component as being independent, conditionedon xc,t−1. The overall continuous evolution of the CPHA is determined bytaking the union of algebraic and difference equations for each componentPHA. The equations F (d) are then solved using Groebner bases [25] andcausal analysis [26] into the standard form

xc,t = f(xc,t−1,uc,t−1,vx,t−1,xd,t)

yc,t =g(yc,t−1,uc,t−1,vy,t−1,xd,t). (1)

7

Page 8: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

Actuator

desiredtorque Robot’s body

exertedtorque true θ

2 Sensor

observedθ

2

2121 ,,, θθθθ &&

health: {ok, failed} ball: {yes, no}

Fig. 3. A composed model for the acrobatic robot in Figure 1. Each component ismodeled with one Probabilistic Hybrid Automaton. The component automata areshown in rectangles, with their state variables shown beneath. Using composition,the PHA for an acrobot body can be augmented using a PHA model of the torqueactuator, and a model of the angular position sensor, measuring θ2. The full discretetransition model is shown in Figure 4. The actuator exerts the commanded torquein the ok mode, but exerts zero torque in the failed mode. The position sensor ismodeled as adding Gaussian white noise to the true value of θ2.

actuator

ok

failed

0.9995

1.0

ball

yes

no

55.01 <θ

1.0

55.01 ≥θ0.99

55.01 ≥θ0.99

55.01 <θ1.0

0.01

0.01

0.0005

Fig. 4. Discrete transition model for the acrobot CPHA. If the actuator has failed,it exerts no torque. When the robot catches the ball, ball=yes, and the mass ofthe lower link increases.

This completes our review of Concurrent Probabilistic Hybrid Automata. Wenow consider the problem of state estimation for these models.

2.2 Hybrid State Estimation

Given a hybrid model of the system, our goal is to estimate its state from asequence of control inputs and observations:

Definition 3 Hybrid State Estimation: Given a CPHA model of the sys-tem CA and the sequence of control inputs u0, . . . ,ut and observed outputsy0, . . . ,yt, at time t determine the hybrid state estimate 〈xd,t,xc,t〉 defined asthe probability p(xd,t,xc,t|y1:t,u0:t).

In general, we can frame the hybrid state estimation problem as that of ap-proximating the posterior distribution 〈xd,t,xc,t〉 and use this distribution to

8

Page 9: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

compute characteristics of interest, such as the MAP mode estimate or con-tinuous state estimate.

2.3 K-best Enumeration

Existing methods for hybrid estimation with CPHA models have used a k-bestenumeration approach[8]. This section outlines the approach.

The desired distribution p(xd,t,xc,t|y1:t,u0:t) can be expressed as a sum ofposterior distributions for all discrete mode trajectories that end in state xd,t:

p(xd,t,xc,t|y1:t,u0:t) =∑

xd,0:t−1

p(xd,0:t,xc,t|y1:t,u0:t). (2)

Each summand can be further expanded as a product of the posterior prob-ability of the discrete mode trajectory xd,1:t and the posterior distribution ofthe continuous state, conditioned on this mode trajectory:

p(xd,0:t,xc,t|y1:t,u0:t) = p(xd,0:t|y1:t,u0:t)p(xc,t|xd,0:t,y1:t,u0:t). (3)

This decomposition leads to a natural representation of the belief state asa mixture of Gaussians, one for each reachable mode trajectory xd,1:t. Givenxd,1:t, the second term can be approximated as a Gaussian, using a combinationof a Kalman Filter and numerical integration techniques, such as GaussianQuadrature and Exact Monomials [11]. The weight of each mixture componentis then computed using the belief state update [18]:

p(xd,0:t|y1:t,u0:t) = b(xd,0:t) ∝ PO · PT · b(xd,0:t−1). (4)

In this equation, PT � p(xd,t|xd,0:t−1,y1:t−1,u0:t) is the prior probability oftransitioning to a state xd,t, given the past mode trajectory and past obser-vations; we refer to this as the transition prior. PO � p(yt|xd,0:t,y1:t−1,u0:t) isthe measurement update. Both PT and PO can be calculated approximately[8].

Naturally, tracking all possible mode sequences is infeasible; the number ofsuch sequences increases exponentially with time. Indeed, inference in proba-bilistic hybrid models, including SLDS, hybrid Bayesian networks, and CPHA,has been shown to be NP-hard [10]. Nevertheless, in many domains, efficientinference is possible by employing two strategies: pruning (branching) andcollapsing (merging) (see Figure 5). Pruning removes some branches from thebelief state, based on the evidence observed so far, while collapsing combinessequences with the same mode at their fringe to a single hypothesis [27,28].

One example of a pruning approach that has been shown to be particularlyeffective in both discrete and hybrid estimation is k-best filtering[8,28,29].K-best filtering methods focus the state estimation on sequences with high

9

Page 10: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

Partially

Open

Partially

Open

Fully

Open

Partially Open

Fully Open

1

2

Partially

Open

Partially

Open

Fully

Open

Partially Open

Fully Open

Stuck Open5

1

2

Fig. 5. Pruning (left) and collapsing strategies (right).

1. Initialization

• create a node s corresponding to each non-zero value of p(xd,0)

• initialize f(s) = −ln(p(xd,0))

• initialize the estimate mean x(s)c,0 ← E[xc,0|x(s)

d,0]

• initialize the estimate covariance P(s)0 ← Cov(xc,0|x(s)

d,0)

• add all nodes x(s)d,0 to priority queue

2. For t = 1, 2, . . .

(a) A* Search Step

• While size(new kbest) < k do

– Remove node s from priority queue with lowest f = g + h: s ← pop from queue()– If s is a leaf node (has full assignment to component modes and PO computed), then

∗ Add s to new kbest

– Else

∗ Expand node s to successors: expanded nodes ← expand to successors(s)∗ Add expanded nodes to priority queue: push onto queue(expanded nodes)

(b) Normalization Step

• Normalize new k-best nodes: normalize(new kbest)

Fig. 6. Hybrid Estimation for CPHA using k-best Enumeration.

posterior probability. Typically, a k-best filter starts with a set of mode se-quences at one time step and expands these sequences to obtain the set ofleading sequences at the next time step. With independence assumptions onthe model, such as those in CPHA, an efficient solution is to frame the ex-pansion as a search and solve it using a combination of branch and boundand A* algorithms [8]. The pseudocode for this approach is shown in Fig-ures 6 and 7. The k-best enumeration approach has been shown empiricallyto be an efficient technique for hybrid state estimation in systems that exhibitautonomous mode transitions, nonlinear dynamics and concurrency [8].

This completes our review of CPHA and k-best enumeration. In Section 3we develop a complementary, stochastic method based on Rao-BlackwellisedParticle Filtering. By combining these methods, making use of the insight fromthe empirical comparison in Section 6, we develop a robust, memory-efficientmethod that balances exploration and exploitation (Section 5).

10

Page 11: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

1. Expand Node s to Successors

• If component number j ≤ N

– Increment component number j: j ← j + 1– For each possible transition of component j to mode xd j,t

∗ Update partial mode assignment with xd j,t

∗ Compute component transition prior: PTj ← p(xd k,t|x(i)d k,0:t−1,y1:t−1,u0:t)

∗ Update node cost: g(s) ← g(s) − ln(PTj)∗ Update heuristic: h(s) ← −∑n

l=j+1 ln(max pτl)– Return successor nodes

• Else

– Perform a Kalman Filter update: x(s)c,t , P

(s)t , r(s)

t ,S(s)t ← UKF (x(s)

c,t−1,P(s)t−1,x

(s)d,t)

– Calculate observation probability: PO ← p(yt|xd,1:t,y1:t−1)– Update node cost: g(s) ← g(s) − ln(PO)– Update heuristic: h(s) ← 0– Return successor nodes

Fig. 7. Node Expansion for k-best Enumeration Algorithm.

3 Rao-Blackwellised Particle Filtering for CPHA

The key contribution of this section is an approximate Rao-Blackwellised par-ticle filtering (RBPF) algorithm for CPHA that handles autonomous modetransitions, that is, those that depend on the continuous state, as well asnonlinear system dynamics and concurrency.

3.1 Overview of Rao-Blackwellised Particle Filtering for CPHA

The new approximate RBPF algorithm is our first step in developing a mixedexploitation/exploration strategy for CPHA, and complements our greedy k-best approach for CPHA. To achieve memory-efficiency, we use particles torepresent Gaussian distributions conditioned on a particular mode sequence,and we develop a Gaussian particle filter for CPHA as an instance of a Rao-Blackwellised Particle Filter[13,22]

Our algorithm is illustrated in Figure 8. In the spirit of prior approaches toRBPF [13,22], our algorithm exploits the structure in the estimation problemand samples only the discrete mode sequences. Conditioned on each sampledsequence, the new algorithm approximates the associated continuous statedistribution as a Gaussian in closed form, using an Extended [19] or an Un-

scented Kalman Filter [20]. Each particle holds a sample trajectory x(i)d,0:t and

the corresponding continuous estimate 〈x(i)c,t,P

(i)t 〉. This is described in detail

in Section 3.3.

The algorithm starts by taking a fixed number of random samples from the

11

Page 12: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

initial distribution over the mode variables p(xd,0) (Step 1). For each sam-

pled mode x(i)d,0, the corresponding initial continuous distribution p(xc,0|x(i)

d,0)is specified by the PHA model. At each time-step, the algorithm then uses themodel to expand the mode sequences of each particle and to update the cor-responding continuous estimates (see Figure 8, Step 2). This is done by first

evolving each particle by taking one random sample x(i)d,t, for each particle, from

a suitably chosen proposal distribution q(xd,t|x(i)d,0:t−1,y1:t,u0:t). Intuitively, the

proposal is a distribution close to the true distribution that we are trying todetermine, p(xd,t|xd,0:t−1 = x

(i)d,0:t−1,y1:t,u0:t); that is, the posterior probability

of a given mode sequence given all of the observations up to time t. The pos-terior distribution is difficult to calculate in closed form, so instead we samplefrom the proposal distribution, which we choose to be easy to calculate. Wethen compensate for the discrepancy between the proposal distribution andthe true distribution by assigning an importance weight w

(i)t for each new

mode sequence x(i)d,0:t. Finally, the resampling step duplicates particles accord-

ing to their weighting, thereby adjusting the number of particles wherever theproposal distribution does not match the desired distribution[12].

In order to instantiate this algorithm we must define the proposal distribu-tion and the importance weight. For the proposal distribution we choose thedistribution p(xd,t|xd,0:t−1 = x

(i)d,0:t−1,y1:t−1,u0:t), which is the transition prior

mentioned in Section 2.3. We show in Section 3.4 that this distribution can becalculated efficiently, even when the model includes autonomous mode tran-sitions. The importance weights then correct for the discrepancy between theproposal and the posterior distribution by taking into account the latest obser-vation yt. In Section 3.6 we show that this can be computed using the KalmanFilter innovation.

Finally, in Section 3.7 the algorithm outlined in Figure 8 is extended to dealwith Concurrent Probabilistic Hybrid Automata.

3.2 Rao-Blackwellised Particle Filtering

In this section we summarize briefly Particle Filtering for hybrid state estima-tion, and review Rao-Blackwellised Particle Filtering.

Particle filters approximate the posterior distribution for the hybrid state xt

with a set of sampled sequences {x(i)0:t}. These samples are evolved sequentially

and approximate the posterior distribution p(xt|y0:t,u0:t) as the probabilitydensity function

pN(xt) =1

N

N∑i=1

δ(x − x(i)t ). (5)

12

Page 13: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

1. Initialization:

)1(

0α(1)

(2) )2(

2. for t = 1, 2, ...

i) sample the mode sequencesacc. to proposal distribution q

)2(

(1)

(2)

)3(

ok

failed

ok

failed

failed

ok

ok

(3)

)1(

failed

iii) compute the importance weights

)1(

tw

)2(

tw

)3(

tw

iv) resample the mode sequences acc. to importance weights

(1)ok

failed

ok

(3)

failed

(2)ok

ok

ii) update continuous estimates with Kalman Filter

)2(

1−tα

)3(

1−tα

)1(

1−tα ty

ty

ty

ty

q

q

q

q

)(itw

Fig. 8. Rao-Blackwellised Particle Filter for PHA.

{x0(i)}

{x1(i)}~

{w1(i)}

{x1(i)}

(1) Initialization

(2) Importance sampling

(3) Selection

mok

p(xc,0,xd,0)

mfailed

p(y1| x1(i))

Fig. 9. The three steps of a simple particle filter for a PHA model with one discreteand one continuous variable.

In the simplest solutions, the samples are taken from the complete hybridstate Xd × R

nx , and are evolved in three steps, as illustrated in Figure 9.In the first, initialization step, the algorithm samples the initial distributionp(x0); thus, effectively approximating the posterior at t = 0. Then, in each

iteration, the particles x(i)0:t are evolved by taking one random sample x

(i)t from

an appropriately-chosen proposal distribution, and by assigning importanceweights that account for the differences between the proposal and the posteriordistributions. The final step selects a number of off-spring for each particleaccording to its weight, thus duplicating the “good” ones and removing the

13

Page 14: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

“bad” ones.

In practice, sampling in high-dimensional spaces can be inefficient, since manyparticles may be needed to cover the probability space and attain a sufficientlyaccurate estimate. Several methods have been developed to reduce the vari-ance of the estimates, including decomposition [30] and abstraction [31]. Oneparticularly effective method is Rao-Blackwellised Particle Filtering [12,22,32].This method is based on a fundamental observation that with some estima-tion problems, a particular part of the desired distribution can be determinedefficiently without using a sampling approach. By factoring out this portion,we obtain a more efficient approach that only samples the remaining variables.

Formally, if we partition the state variables into two sets, r and s, we can usethe chain rule to express the posterior distribution p(x0:t|y1:t,u0:t) as:

p(x0:t|y1:t,u0:t) = p(s0:t, r0:t|y1:t,u0:t)

= p(s0:t|r0:t,y1:t,u0:t)p(r0:t|y1:t,u0:t). (6)

Thus, we expand the posterior in terms of the sequence of random variablesr0:t and in terms of the sequence s0:t conditioned on r0:t. The key to thisformulation is that if we can compute analytically the conditional distributionp(s0:t|r0:t,y1:t,u0:t) or its marginal p(st|r0:t,y1:t,u0:t), then we only need tosample the sequences of variables r0:t, not both s0:t and r0:t. Intuitively, farfewer particles will be needed in this way to reach a given precision of theestimate, since for each sampled sequence r0:t, the corresponding state spaces is covered by an analytical solution, rather than a finite number of samples.

In Rao-Blackwellised particle filtering (RBPF), each particle holds not only

the samples r(i)0:t, but also a parametric representation of the distribution

p(st|r(i)0:t,y1:t) for each sample i, which we denote by α

(i)t . This representa-

tion holds sufficient statistics for p(st|r(i)0:t,y1:t), such as the mean vector and

the covariance matrix of a Gaussian distribution 2 . The posterior is thus ap-proximated as a mixture of the distributions α

(i)t at the sampled points r

(i)0:t:

p(s0:t, r0:t|y1:t,u0:t) ≈N∑1

αt(i)δ(r0:t − r(i)0:t). (7)

A generic RBPF method is outlined in Figure 10, and, except for the initial-ization and the addition of the exact step, it is identical to the particle filter,illustrated in Figure 9. Under weak assumptions, the Rao-Blackwellised esti-mate converges to the estimated value as N → +∞, with a variance smaller

2 Certain distributions can be encoded compactly in terms of a sufficient statistic.Given this statistic, the distribution is defined completely.

14

Page 15: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

1. Initialization

• For i = 1, . . . , N draw a random sample r(i)0 from the prior distribution p(r0) and let α

(i)0 ← p(s0|r(i)

0 )

2. For t = 1, 2, . . .

(a) Importance sampling step

• For i = 1, . . . , N

– draw a random sample r(i)t from the proposal q(rt|r(i)

0:t−1,y0:t,u0:t)

– let r(i)0:t ← (r(i)

0:t−1, r(i)t )

• For i = 1, . . . , N , compute the importance weights: w(i)t ← p(yt|r(i)

0:t,y0:t−1,u0:t)p(r(i)t |r(i)

0:t−1,y0:t−1,u0:t)

q(r(i)t |r(i)

0:t−1,y0:t,u0:t)

• For i = 1, . . . , N normalize the importance weights w(i)t

(b) Exact step

• Update α(i)t given α

(i)t−1, r

(i)t , r

(i)t−1, yt, ut−1, and ut with a domain-specific procedure (such as a Kalman Filter)

(c) Selection step

• Select N particles (with replacement) from {r(i)0:t} according to the normalized weights {w(i)

t } to obtain samples {r(i)0:t}

Fig. 10. Generic RBPF algorithm. [33]

than the non-Rao-Blackwellised particle filtering method with the same num-ber of particles[15]. Hence the RBPF is superior given a fixed number ofparticles; however, the run-time performance of the filter will depend on thecost of the exact update for α

(i)t .

Having reviewed Rao-Blackwellisation, we now describe how we apply thisconcept to PHA.

3.3 Rao-Blackwellisation for Hybrid Estimation with PHA

When estimating the hybrid state of Probabilistic Hybrid Automata, theposterior distribution over the continuous state, p(xc,t|xd,0:t,u0:t), can be ap-proximated efficiently in an analytical form using an Extended or UnscentedKalman Filter. We can, therefore, apply Rao-Blackwellisation to the hybridestimation problem for PHA by taking r = xd and s = xc.

We sample the mode sequences x(i)d,0:t with a particle filter and, for each sampled

sequence x(i)d,0:t, we estimate the continuous state with a Kalman Filter. The re-

sult of Kalman Filtering for each sampled sequence x(i)d,0:t is the estimated mean

x(i)c,t and the error covariance matrix P

(i)t . The samples x

(i)d,0:t serve as an approxi-

mation of the posterior distribution over the mode sequences, p(xd,0:t|y1:t,u0:t),

while each continuous estimate 〈x(i)c,t,P

(i)t 〉 serves as a Gaussian approximation

of the conditional distribution p(xc,t|xd,0:t = x(i)d,0:t,y1:t,u0:t) � αi

t. Since the

estimate 〈x(i)c,t,P

(i)t 〉 merely approximates αi

t, we are not performing a strictRao-Blackwellisation; nevertheless, the distribution will be accurate up to theapproximations in the Extended or the Unscented Kalman Filter.

The continuous estimate for each new mode sequence x(i)d,0:t is updated as shown

in Figure 8ii). Since in a PHA, each mode assignment d over the variables xd

15

Page 16: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

is associated with transition and observation functions:

xc,t = f(xc,t−1,ut−1,d) + vx(d)

yc,t =g(xc,t,ut,d) + vy(d), (8)

we update each estimate x(i)c,t−1, P

(i)t−1 with a Kalman Filter, using the tran-

sition function f(xc,t−1,ut−1,d), observation function g(xc,t,ut,d), and noise

variables vx(x(i)d,t) and vy(x

(i)d,t), to obtain a new estimate 〈x(i)

c,t, P(i)t 〉.

Hence Rao-Blackwellisation can be applied to hybrid estimation with PHA.We now describe the rest of the algorithm in detail.

3.4 Proposal distribution

In particle filtering, the proposal distribution is chosen to be one that is closeto the desired distribution, but that can also be calculated efficiently in closedform. In this section we specify the proposal distribution q(xd,t|x(i)

d,0:t−1,y1:t,u0:t)and show how it is calculated efficiently, while taking into account autonomousmode transitions.

An autonomous mode transition is a form of guarded transition for which thetransition distribution p(xd,t|xd,t−1,xc,t−1,ut−1) depends explicitly on the con-tinuous state xc,t−1. In PHA, the transition distribution is specified as a finiteset of guard conditions c and their associated transition probabilities pτ . Eachguard condition specifies a region over the continuous state and automaton’sinput/output variables, for which p(xd,t|xd,t−1,xc,t−1,ut−1) = pτ . Hence thetransition distribution is piecewise constant over xc,t−1 (see Figure 12).

We choose the proposal distribution to be p(xd,t|xd,0:t−1 = x(i)d,0:t−1,y1:t−1,u0:t).

This distribution expresses the probability of the transition from the modex

(i)d,0:t−1 to each mode xd,t ∈ Xd and is similar in its form to the transition

distribution p(xt|xt−1) in a Markov process. However, it is conditioned on acomplete discrete state sequence and all previous observations and controlactions, rather than simply on the previous state. This is because {xd,t} aloneis not an HMM process: due to the autonomous transitions, knowing onlyxd,t−1 does not tell us what the distribution of xd,t is. The distribution of xd,t

is known only when conditioned on the mode and the continuous state forthe previous time step (see Figure 11). Since the continuous state must beestimated, rather than being observed directly, autonomous transitions makecalculation of the proposal challenging.

We are able to calculate the proposal distribution efficiently for each trackedmode sequence x

(i)d,0:t−1 by using the continuous estimate 〈x(i)

c,t−1,P(i)t−1〉 as fol-

16

Page 17: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

xd,t-1xd,t

xc,t-1 xc,t

yc,t

Fig. 11. Conditional dependencies in PHA among the state variables xc,xd and theoutput y, expressed as a dynamic Bayesian network [34]. The edge from xc,t−1 toxd,t represents the dependence of xd,t on xc,t−1, that is, autonomous transitions.

lows: From the total probability theorem, the proposal distribution is equalto the joint distribution of xd,t and xc,t−1, marginalized over xc,t−1. The jointdistribution can then be expressed in terms of the discrete transition proba-bility conditioned on the previous state, and the continuous state distributionconditioned on the i-th sequence, p(xc,t−1|x(i)

d,0:t−1,y1:t−1,u0:t) � αit−1:

p(xd,t|x(i)d,0:t−1,y1:t−1,u0:t)

=∫xc,t−1

p(xd,t,xc,t−1|x(i)d,0:t−1,y1:t−1,u0:t) dxc,t−1

=∫xc,t−1

p(xd,t|x(i)d,0:t−1,y1:t−1,xc,t−1,u0:t)p(xc,t−1|x(i)

d,0:t−1,y1:t−1,u0:t) dxc,t−1

=∫xc,t−1

p(xd,t|x(i)d,t−1,xc,t−1,ut−1)p(xc,t−1|x(i)

d,0:t−1,y1:t−1,u0:t−1) dxc,t−1 (9)

Here, the third equality comes from the independence assumptions made inthe model; the distribution of xd,t is independent of the observations y1:t−1

and mode assignments prior to time t − 1, given the state at time t − 1.

The proposal distribution can therefore be expressed as an integral over twoknown quantities; first, the transition distribution p(xd,t|x(i)

d,t−1,xc,t−1,ut−1),which is expressed in the PHA model; and second, the continuous state distri-bution αi

t−1. The latter is approximated by the continuous estimate 〈x(i)c,t−1,P

(i)t−1〉.

Typically, when performing Rao-Blackwellisation, the integral in (9) is difficultto evaluate efficiently[33]. In this section we show that for PHA, however,efficient evaluation of this integral is possible.

For the piecewise constant transition distribution in PHA, the left term inthe integral in (9), p(xd,t|x(i)

d,t−1,xc,t−1,ut−1) takes on only a finite number ofvalues pτj(xd,t). Hence we can split the integral domain into the sets Xj thatsatisfy the guard condition cj and factor out the transition probability pτj:

17

Page 18: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

1,1 −tθ

],|Pr[ 1,11 −− == ttt noballyesball θ

01.0

55.0

Fig. 12. Probability of a mode transition ball=no to ball=yes as a function ofθ1,t−1.∫

xc,t−1

p(xd,t|x(i)d,t−1,xc,t−1,ut−1)p(xc,t−1|x(i)

d,0:t−1,y1:t−1,u0:t−1) dxc,t−1

=∑j

∫Xj

p(xd,t|x(i)d,t−1,xc,t−1,ut−1)p(xc,t−1|x(i)

d,0:t−1,y1:t−1,u0:t−1) dxc,t−1

=∑j

pτj(xd,t)∫

Xj

p(xc,t−1|x(i)d,0:t−1,y1:t−1,u0:t−1) dxc,t−1

=∑j

pτj(xd,t) Prα

(i)t−1

[Xj]. (10)

Here, Prα

(i)t−1

[Xj] is the probability that guard condition cj is satisfied.

The second equality holds because, for the region Xj, the conditional distri-

bution p(xd,t|x(i)d,t−1,xc,t−1,ut−1) is fixed and equal to pτj. Therefore, in each

summed term, we multiply the transition distribution pτj by the probability

of satisfying the guard condition cj in the distribution α(i)t−1. Hence, the key

contribution for PHA is that, given the probability of satisfying each guardcondition cj, the proposal distribution for each sample i is calculated by sum-ming over all of the possible guard conditions.

3.5 Evaluating the probability of satisfying transition guards

Given the derivation in the previous section, the remaining challenge in com-puting the proposal distribution is to evaluate the probability of satisfying theguard condition cj, given the distribution α

(i)t−1 = p(xc,t−1|x(i)

d,0:t−1,y1:t−1,u0:t−1).For the following derivation, we assume without loss of generality, that theguard conditions are only over the continuous state 3 .

As described in Section 3.3, the posterior distribution α(i)t−1 is approximated

3 Guard conditions involving discrete or continuous input variables, such as cc(xc)∧cd(ud), are handled by setting Pr

α(i)t−1

[Xj ] ≡ 0 whenever cd(ud) is not satisfied. More

complex guards are transformed to a number of simpler guards using elementaryrules of logic.

18

Page 19: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

1θ55.0

Fig. 13. Evaluating single-variate guard conditions.

using a Gaussian distribution with mean x(i)c,t−1 and covariance P

(i)t−1. The prob-

ability of xc,t being in the guard region Xj is then simply an integral over aknown Gaussian distribution:

Prα

(i)t

[Xj] ≈ 1

(2π)nc/2|P(i)t−1|1/2

∫Xj

e−12(xc−x

(i)c,t−1)

T P(i)t−1

−1(xc−x

(i)c,t−1) dxc (11)

When the guard conditions are of the form x < c or x ≤ c, for some constantc, such as θ1 < 0.55, the integral in (11) simplifies to evaluating the cumulative

density function D(c) of the normal variable N (µ, σ2), where µ = (x(i)c,t−1)x is

the mean of variable x in x(i)c,t−1 and σ2 = (P

(i)t )x is its variance (Figure 13):

D(c) � 1

σ√

∫ c

−∞e−(x−µ)2/(2σ2)dx. (12)

The cumulative density function D(c) is evaluated using standard numericalmethods, such as a trapezoidal approximation or using a table lookup. In orderto evaluate the probability of the complementary guards x > c or x ≥ c, wetake the complement of the cumulative density function, 1 − D(c).

The above forms of guard conditions can be viewed as a special case of a moregeneral form, in which x falls into an interval [l, u] 4 , where l, u are in theextended set of real numbers R

+ � R∪{−∞, +∞} that includes positive andnegative infinity. In these cases, the probability of satisfying a guard conditioncan be expressed as the difference of the c.d.f at the endpoints of the interval,D(u) − D(l).

We have previously used this technique with our k-best enumeration approachfor CPHA[5]. Using this approach for RBPF, we are able to calculate the pro-posal distribution in Section 3.4 efficiently for a particular class of autonomousmode transitions; those whose guard conditions are singlevariate intervals. InSection 4.1 we generalize this method to apply to CPHA with multivariatelinear guard conditions.

4 Whether the interval is closed or open matters only if x can have a zero variance. Itis straightforward to generalize the discussion here to open and half-open intervals.

19

Page 20: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

Within the CPHA modeling formalism, however, the transition distribution,that is, p(xd,t|x(i)

d,t−1,xc,t−1,ut−1), is still constrained to be piecewise constantin xc,t−1. We present results that show that this contraint can be relaxed.In Section 4.2 we generalize the approach in this section to calculate thetransition prior p(xd,t|x(i)

d,0:t−1,y1:t−1,u0:t) for arbitrary polynomial transitiondistributions. This transition prior is used in the Rao-Blackwellised ParticleFilter, as the proposal distribution, and in k-best enumeration. Hence thiscontribution expands the class of autonomous mode transitions that can behandled by both approaches to hybrid estimation.

3.6 Importance weights

In this section we describe how, given our choice of proposal distribution, theimportance weight can be calculated. The importance weight compensates forthe discrepancy between the proposal distribution, which we chose to be thetransition prior p(xd,t|xd,0:t−1 = x

(i)d,0:t−1,y1:t−1,u0:t), and the desired distri-

bution, p(xd,t|xd,0:t−1 = x(i)d,0:t−1,y1:t,u0:t). The importance weight determines

how many duplicates of a given particle are generated, thereby adjusting thenumber of particles for which the proposal distribution did not match the de-sired distribution. In our case, the importance weight incorporates the latestobservation in order to update the prior distribution, to give the posteriordistribution.

Given our choice of proposal distribution, the weights w(i)t simplify to

w(i)t �

p(yt|x(i)d,0:t,y0:t−1)p(x

(i)d,t|x(i)

d,0:t−1,y0:t−1)

q(x(i)d,t,x

(i)d,0:t−1,y0:t)

= p(yt|x(i)d,0:t,y0:t−1,u0:t) (13)

This expression represents the likelihood of the observation yt, given a com-plete mode sequence x

(i)d,0:t, inputs u0:t, and previous observations y0:t−1. PHA,

like most hybrid models, do not directly provide this likelihood and only pro-vide the probability of an observation y, conditioned on the discrete andcontinuous state. This likelihood is approximated using the Kalman Filterinnovation at time t as follows.

In the Kalman Filter predict/measurement cycle, the Gaussian distribution

α(i)t−1 = N (x

(i)c,t−1,P

(i)t−1) is propagated through the continuous transition and

observation functions for mode x(i)d,t (8). For SLDS models, this gives a Gaussian

distribution for the observed value yt, with mean yp and covariance S(i)t . The

Kalman Filter innovation, r = yt − yp, is defined as the difference betweenthe expected observation and the actual observation. For SLDS models, the

20

Page 21: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

observation likelihood in (13) is calculated exactly as[27]:

w(i)t =

1

(2π)N/2|S(i)t |1/2

e−0.5rT (S(i)t )−1r. (14)

A similar approach leads to an efficient approximation of the weight in the casewhen the model contains nonlinear dynamics and autonomous transitions. Dueto the nonlinearities and autonomous transitions, the conditional distributionα

(i)t−1 = p(xc,t−1|x(i)

d,0:t,y0:t−1,u0:t) is no longer strictly Gaussian. Nevertheless,

if we approximate it with the estimated Gaussian N (x(i)c,t−1,P

(i)t−1), as we have

done in the previous subsections, we can compute the weight in (13) fromthe Extended[19] or the Unscented[20] Kalman Filter measurement update.For example, with an Extended Kalman Filter, the observation likelihoodis computed by first propagating the Gaussian distribution N (x

(i)c,t−1,P

(i)t−1)

through the system model in mode x(i)d,t:

x(i−)c,t = f(x

(i)c,t−1,ut−1) (15)

A=∂f

∂xc

|x(i)c,t (16)

P(i−)t =AP

(i−)t AT + Q, (17)

where Q = cov(vx(x(i)d,t)) is the system noise in mode x

(i)d,t. This leads to the

observation prediction yp with covariance S(i)t :

yp =g(x(i−)c,t ,ut) (18)

C=∂g

∂xc

|x(i−)c,t (19)

S(i)t =CP

(i−)t CT + R, (20)

where R = cov(vy(x(i)d,t)) is the observation noise in mode x

(i)d,t. The observation

likelihood in (13) can then be approximated with normal p.d.f.

w(i)t =

1

(2π)N/2|S(i)t |1/2

e−0.5rT (S(i)t )−1r. (21)

We have therefore presented an novel, approximate Rao-Blackwellised par-ticle filtering algorithm for Probabilistic Hybrid Automata. This is able tohandle autonomous mode transitions and nonlinear dynamics. in Section 3.7we extend this method to handle concurrent Probabilistic Hybrid Automata(CPHA).

21

Page 22: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

3.7 Rao-Blackwellised Particle Filtering for CPHA

In practice, a model will be composed of several concurrently operating au-tomata that represent individual components of the underlying system. In thismanner, the design of the models can be split on a component-by-componentbasis, thus enhancing the reusabiliy of the models and reducing modeling costs.In this section we extend our Rao-Blackwellised Particle Filter for PHA, devel-oped in the previous section, to handle concurrent PHA models; see Section 2.1for an overview of CPHA.

In CPHA, component transitions are conditionally independent, given thecurrent discrete and continuous state. Therefore, it is possible to compute thetransition probabilities P

(i)T for each tracked mode sequence component-wise

[5,35]. This property is exploited by our algorithm in the importance samplingstep, whereby the samples are evolved according to the transition distributionPT on a component-by-component basis.

The algorithm in Section 3 sampled the mode sequences according to theproposal distribution q(xd,t|x(i)

d,0:t−1,y1:t,u0:t) = p(xd,t|x(i)d,0:t−1,y1:t−1,u0:t) �

P(i)T ,t. This represents the prior probability of being in the mode xd,t at time t,

conditioned on the previous sequence of modes x(i)d,0:t−1 and observations y1:t−1,

leading to that mode. Given this choice of the proposal, the importance weightssimplify to:

w(i)t = p(yt|x(i)

d,0:t,y1:t−1,u0:t) � P(i)O,t. (22)

When sampling mode sequences in CPHA, we use the same proposal distri-bution. The only difference is that now, instead of computing the transitionprobability for every value in the domain Xd of the discrete variables xd, weevaluate it only for the individual component’s discrete domain Xd j, and ob-

tain the joint transition distribution p(xd,t|x(i)d,0:t−1,y1:t−1,u0:t) as a product

of component transition distributions∏

j p(xd j,t|xd j,0:t−1,y1:t−1,u0:t), for allcomponents j in the model (see Section 3.4).

The final algorithm is shown in Figure 14 5 . To summarize, using a Rao-Blackwellised Particle Filtering approach, this algorithm is able to estimate ef-ficiently the hybrid state of Concurrent Probabilistic Hybrid Automata, whichhave autonomous mode transitions, nonlinear dynamics, and many concur-rently operating components.

5 Note that, compared to the generic RBPF algorithm in Figure 10, the importanceweight is now calculated after the exact step, because the innovation mean andcovariance, computed in the Kalman Filter update step, are used to compute theimportance weight.

22

Page 23: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

1. Initialization

• For i = 1, . . . , N

– draw a random sample x(i)d,0 from the prior distribution p(xd,0)

– initialize the estimate mean x(i)c,0 ← E[xc,0|x(i)

d,0]

– initialize the estimate covariance P(i)0 ← Cov(xc,0|x(i)

d,0)

2. For t = 1, 2, . . .

(a) Importance sampling step

• For i = 1, . . . , N– For each component k

∗ compute the transition distribution p(xd k,t|x(i)d k,0:t−1,y1:t−1,u0:t)

∗ sample x(i)d,t k ∼ p(xd k,t|x(i)

d k,0:t−1,y1:t−1,u0:t)

– let x(i)d,0:t ← (x(i)

d,0:t−1, 〈x(i)d,t 1, . . . ,x

(i)d,t nc

〉)(b) Exact step

• For i = 1, . . . , N

– perform a KF update: x(i)c,t,P

(i)t , r(i)

t ,S(i)t ← UKF (x(i)

c,t−1,P(i)t−1,x

(i)d,t)

– compute the importance weight: w(i)t ← N (r(i)

t ,S(i)t )

(c) Selection step

• normalize the importance weights w(i)t

• Select N particles (with replacement) from {〈x(i)d,0:t, x

(i)c,t,P

(i)t 〉} according to the normalized

weights {w(i)t } to obtain particles {〈x(i)

d,0:t, x(i)c,t,P

(i)t 〉}

Fig. 14. Rao-Blackwellised Particle Filter for CPHA.

4 Generalizing Autonomous Mode Transitions

In this section we extend the class of autonomous mode transitions that canbe handled by both k-best enumeration and Rao-Blackwellised Particle Fil-tering for CPHA. We describe the generalization to multivariable linear tran-sition guards, in the case of piecewise constant transition distributions, andto polynomial transition distributions of arbitrary order, in the case of singlevariables.

4.1 Multi-variate guard conditions

The key challenge in handling autonomous mode transitions is to compute thetransition prior p(xd,t|x(i)

d,0:t−1,y1:t−1,u0:t) efficiently.

In the case of CPHA, this probability can be expressed as a sum over a finitenumber of terms, where each term is a product of the probability Pr

α(i)t−1

[Xj] of

a guard condition cj being satisfied, and the corresponding transition probabil-ity pτj (10). The remaining challenge is to calculate Pr

α(i)t−1

[Xj]. Our previous

work showed how this can be carried out efficiently for guard conditions thatare single-variate intervals[8,18]. In this section we show that this can be gen-eralized to rectangular and linear multivariate guard conditions.

23

Page 24: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

1h

2h

]0,0[

),,|,(

),ˆ,ˆ(

1:01:1

)(

1:0,21

21

−−−

ttitdhhp

Phh

uyx

N

Fig. 15. Rectangular integral over a Gaussian approximation of the posterior densityof h1 and h2, p(h1, h2|x(i)

d,0:t−1,y1:t−1,u0:t−1).

4.1.1 Rectangular Multivariate Guard Conditions

Multivariate guard conditions are often needed to represent more complexconstraints on transitions than can be handled by single-variate interval con-ditions. For example, consider a system with two tanks, connected by a pipeat height h. The mode transitions for the flow between the two tanks are con-strained by how the fluid heights in the two tanks, h1 and h2, compare to h.In this system, the transition into the no-flow mode would be conditioned onthe guard condition (h1 < h) ∧ (h2 < h).

In general, the rectangular multi-variate guard conditions will take the form∧i∈I(xi ∈ [li, ui]), where xi are distinct continuous state variables and I �

{i1, . . . , in} are their indices. Evaluating the probability of such a multi-variateguard condition amounts to evaluating the multi-dimensional (hyper)rectangularintegral over a Gaussian distribution (see Figure 15):

Prα

(i)t

[Xj] ≈ 1

(2π)nc/2|PI |1/2

∫ ui1

li1

∫ ui2

li2

· · ·∫ uin

lin

e−12(xc−xc,I)T PI

−1(xc−xc,I) dxc,

(23)where xc,I is the mean of guard values, selected from the continuous state esti-

mate x(i)c,t−1, and PI is the covariance matrix of guard values, selected from the

estimate covariance P(i)t−1. Rectangular integrals over Gaussian distributions

are evaluated efficiently using numerical methods, such as those presented in[36,37]. As an alternative, one could use Monte Carlo methods to evaluate theintegral (23). Numerical methods tend to perform better, hence this is theapproach used in our system.

24

Page 25: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

1h

2h

]0,0[

),,|,(

),ˆ,ˆ(

1:01:1

)(

1:0,21

21

−−−

ttitdhhp

Phh

uyx

N

Fig. 16. Linear guard condition h2 < h1 over the Gaussian approximation of theposterior density of h1 and h2.

4.1.2 Linear multi-variate guards

Sometimes, transition guards are best represented by a linear combinations ofcontinuous variables. For example, in a two-tank system, the direction of theflow between the two tanks depends on the heights in the two tanks. Hence,the mode variable for the flow direction would be guarded by the linear guardsh1 − h2 > 0 and h1 − h2 < 0 (see Figure 16). While it would be possible toinclude h1−h2 as a derived state variable in the model, doing so would increasethe computational complexity of the Kalman Filter update by one dimension,and would make the covariance matrix singular. A singular covariance matrixprevents the Kalman Filter update essential to the Rao-Blackwellised Particlefrom being carried out. Instead, we apply a linear transform to the Gaussiandistribution, thus reducing the computation to an instance of the rectangularmultivariate integral performed in Section 4.1.1.

Suppose that the guard condition c is expressed as a conjunction of clauses∧ni=1 li < aixc < ui, where ai is the vector of guard coefficients that spec-

ify condition ci. Such guard conditions correspond to a convex space thatis formed as an intersection of hyper-planes li < aixc and aixc < ui. Let

A �[a1 a2 . . . an

]T

be the matrix of the guard coefficients and define z �Axc,t−1 as the derived vector with n elements. Then the guard conditionc � ∧n

i=1 li < aixc < ui is equivalent to the guard condition∧n

i=1 li < zi < ui.The probability of the guard condition c being satisfied can thus be evaluatedas an integral

∫ ui1

li1

∫ ui2

li2

· · ·∫ uin

lin

p(zt|x(i)d,0:t−1,y1:t−1,u0:t−1) dxc, (24)

over the rectangular region [l1, u1] × [l2, u2] × · · · × [ln, un].

25

Page 26: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

0.01

1

Tsafe

Tcrit

Component Temperature

Fig. 17. Probability of transition into failed mode for a simple component as afunction of temperature. Below the safe temperature, Tsafe, there is a small failureprobability. Above this, the failure probability increases linearly until Tcrit, at whichpoint failure is guaranteed.

For this general case, the posterior distribution of z is as intractable as theposterior distribution α

(i)t−1 of xc,t−1, thereby defeating the purpose of the

proposal distribution. However, since we approximate α(i)t−1 with a Gaussian

N (x(i)c,t−1,P

(i)t−1), the distribution of z is Gaussian, with a mean Ax

(i)c,t−1 and a

covariance AP(i)t−1A

T . Therefore, the key result is that the linear guard con-ditions can once again be evaluated as a rectangular integral over a Gaussiandistribution.

4.2 Polynomial Transition Distributions

The PHA formalism specifies a finite set of guarded transitions between dis-crete modes, each with a constant transition probability, given that the guardcondition is satisfied. Hence the transition distribution p(xd,t|x(i)

d,t−1,xc,t−1,ut−1)is piecewise constant in xc,t−1 (Figure 12). We now present a method for calcu-

lating p(xd,t|x(i)d,0:t−1,y1:t−1,u0:t), i.e. the transition prior, when the transition

distribution p(xd,t|x(i)d,t−1,xc,t−1,ut−1) is not piecewise constant. In particular,

we show that the integral in (9) can be calculated efficiently when the transi-tion distribution is described by a piecewise polynomial function of arbitraryorder, for transition distributions defined over a single variable. An exam-ple of a piecewise polynomial transition distribution is shown in Figure 17.This contribution greatly expands the class of models about which our hy-brid estimation approaches can reason, since piecewise polynomial functionsof arbitrary order can approximate any piecewise smooth function to arbitraryaccuracy.

For a transition distribution defined over a single variable x in xc,t−1, meaning

that, p(xd,t|x(i)d,t−1,xc,t−1,ut−1) = p(xd,t|x(i)

d,t−1, x,ut−1) the transition prior canbe written as:

26

Page 27: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

p(xd,t|x(i)d,0:t−1,y1:t−1,u0:t)

=∫

xp(xd,t|x(i)

d,t−1, x,ut−1)p(x|x(i)d,0:t−1,y1:t−1,u0:t−1) dx. (25)

As in Section 3.5 we approximate p(x|x(i)d,0:t−1,y1:t−1,u0:t−1) using a Gaussian

distribution with mean µ and covariance σ. As before, we assume the cumu-lative distribution function D(c), given by (12), can be calculated efficiently.

If p(xd,t|x(i)d,t−1, x,ut−1) is piecewise constant in x, the transition prior is eval-

uated using the method described in Section 3.4. This requires the evaluationof D(c) at the boundary values of every guard condition. Consider now thecase where the transition distribution is piecewise linear in x:

p(xd,t|x(i)d,t−1, x,ut−1) = a0,j + a1,jx

x ∈ Xj (26)

When calculating the transition prior in (25), the integral can be split into afinite number of integrals of the following form:

∫xp(xd,t|x(i)

d,t−1, x,ut−1)p(x|x(i)d,0:t−1,y1:t−1,u0:t−1) dx

=1

σ√

∑j

∫Xj

(a0,j + a1,j)xe−(x−µ)2/(2σ2) dx. (27)

The integral for each guard condition can be rewritten as:

∫Xj

a1,j(x − µ)e−(x−µ)2/(2σ2) dx +∫

Xj

(a0,j + a1,jµ)e−(x−µ)2/(2σ2) dx. (28)

We now use the result that the integral

∫ u

l(x − µ)e−(x−µ)2/(2σ2) dx =

[−σ2e−(x−µ)2/2σ2

]u

l, (29)

can be calculated in closed form using a substitution of the form v = (x−µ)2.Assuming the guard region Xj takes the form x ∈ [l, u] this means the firstterm in (28) can be calculated in closed form. The second term in (28) is aninterval integral over the Gaussian p.d.f. for x, and this is evaluated using thecumulative distribution function D(c), as in (12). Hence for linear transitionfunctions, the transition prior can be calculated efficiently.

We now generalize this result to piecewise polynomial transition functions ofthe form:

27

Page 28: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

p(xd,t|x(i)d,t−1, x,ut−1) =

np∑i=0

ai,jxi

x ∈ Xj. (30)

The transition prior in (25) can now be decomposed further, according to eachterm of the polynomial:

∫xp(xd,t|x(i)

d,t−1, x,ut−1)p(x|x(i)d,0:t−1,y1:t−1,u0:t−1) dx

=∑j

∑i

∫Xj

ai,jxip(x|x(i)

d,0:t−1,y1:t−1,u0:t−1) dx. (31)

The polynomial in (30) can be rewritten as:

np∑i=0

ai,jxi =

np∑i=0

a′i,j(x − µ)i, (32)

and correspondingly the transition prior can be written as:

∫xp(xd,t|x(i)

d,t−1, x,ut−1)p(x|x(i)d,0:t−1,y1:t−1,u0:t−1) dx

=∑j

∑i

1

σ√

∫Xj

a′i,j(x − µ)ie−(x−µ)2/(2σ2) dx. (33)

By repeated integration by parts, the more general form of the integral in(29),

∫ u

l(x − µ)ie−(x−µ)2/2σ2

dx

=[−σ2(x − µ)i−1e−(x−µ)2/2σ2

]u

l+

∫ u

l(i − 1)(x − µ)i−2σ2e−(x−µ)2/2σ2

dx,

(34)

can be reduced to a summation of closed form terms, plus either the integralin (29), in the case of odd i, or the integral in (12), in the case of even i. In thefirst case the integral is evaluated entirely in closed form, while in the secondcase the only non-closed form term is the integral over the p.d.f., which isevaluated using D(c). Furthermore, all of the terms that cannot be evaluatedin closed form are scaled versions of the same integral over the p.d.f.; hence,D(c) needs only to be evaluated at the boundaries of the guard regions, aswas the case for the piecewise constant transition function.

Thus, for transition distributions over single variables, the transition prioris calculated efficiently for transition distributions that are piecewise polyno-mial. This contribution greatly expands the class of models about which both

28

Page 29: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

Rao-Blackwellised particle filtering and existing k-best methods for CPHA canreason, since piecewise polynomial functions of arbitrary order can approxi-mate any piecewise smooth function to arbitrary accuracy.

5 Hybrid Estimation using a Mixed Stochastic/Greedy Method

5.1 A Unified Treatment of k-best Enumeration and RBPF

Our objective, as stated in the introduction, is a robust, memory efficientmethod for Gaussian filtering of CPHA. Robustness is achieved by balancingexploration with exploitation, while memory efficiency is achieved by using amixture-of-Gaussians representation. To this end, Section 2.3 described ourprevious method for Hybrid Estimation with CPHA based on greedy suc-cessor enumeration. Section 3 described a new exploration method based onRao-Blackwellised Particle Filtering. These methods lend themselves to unifi-cation, in that they represent the belief state as a mixture of Gaussians, witheach Gaussian representing a mode trajectory xd,0:t, and in both methods thecontinuous distribution, conditioned on each trajectory, p(xc,t|xd,0:t,y1:t,u0:t)may be approximated using a Kalman Filter. In addition, both methods usethe same approach to evaluating the transition prior p(xd,t|xd,0:t−1,y1:t−1,u0:t)in the presence of autonomous mode transitions.

Finally, both methods approximate the posterior belief state by tracking onlya subset of the reachable mode trajectories. The key difference between thetwo methods is the approach used to select which mode trajectories to track; k-best enumeration greedily selects the k trajectories with the highest posteriorprobability p(xd,0:t|y1:t,u0:t), while RBPF uses stochastic sampling. To gaininsight into the relative behavior of these two methods, and their appropriatecombination, in Section 5.2 we evaluate their performance on a fault detectionscenario. We show that for a posterior distribution that is flat across manydistinct hypotheses, stochastic methods can be more successful than greedyenumeration.

However by representing the posterior probability p(xd,1:t|y1:t,u0:t) using thenumber of repeated particles, RBPF introduces additional approximation andtypically reduces the number of distinct trajectories tracked at any time. Thismeans that for a relatively concentrated posterior, k-best enumeration willtypically outperform RBPF. These results are demonstrated in Section 6 usingthe simulated acrobatic robot.

In Section 5.3 we develop a new algorithm that unifies the two approaches,exploiting the differences between them in order to increase the robustness

29

Page 30: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

... <no,ok><no,ok>

<no,failed>

......

...

tF-1

tF

k tra

jecto

ries

... <no,ok><no,ok> <no,failed>

......

? <no,failed>

tF-1

tF

...

Stochastic samplingResampling according

to importance weights

Failure detection with (a) k-best enumeration (b) RBPF

Fig. 18. In (a), the trajectories are shown ordered in terms of their posterior prob-ability, with the most likely at the top. Trajectories below the dashed line arediscarded. The true mode trajectory is shown in bold. In (b), the question markindicates that the generation of the successor corresponding to the true trajectoryis generated stochastically.

of the algorithm to changes in the variance of the posterior distribution. InSection 6.1 experimental results are presented that validate this insight.

5.2 Comparing Greedy and Stochastic Search

Consider the acrobatic robot modeled in Figures 1, 2 and 3. Suppose that therobot does not carry a ball, and its actuator is functional, up to time step tF ,when a failure occurs. As illustrated in Figure 18(a), the k-best enumerationalgorithm maintains the nominal trajectory, and will detect the failure, as longas a trajectory with the fault transition is among the set of leading trajectoriesat (or near) the time when the actual fault occurs 6 . By contrast, with theRBPF, mode transitions are sampled stochastically (Figure 18(b)). For thetrue trajectory to be tracked, the mode transition into the failure state mustbe sampled. Since this transition has a low prior many particles will be neededin order to detect the fault.

Now, consider a modified model, in which the probability of catching a ballis 0.5, rather than 0.01. In addition, let the mass of the ball be small, so that

6 It is insufficient to consider the fault transition several time steps later. A modetrajectory that has the fault trajectory occurring much later will have an estimateof the continuous mode that does not match the observations.

30

Page 31: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

the effect of catching a ball, on the observations, is relatively small. In thiscase, the posterior distribution over the trajectories will be flat, as there willbe many trajectories in the belief state that oscillate between the robot havingand not having a ball. Initially, these trajectories will have higher posteriorprobabilities than the ground-truth trajectory, because they have much higherpriors than the failure, and it takes several time steps before the evidence forthe failure builds up. The RBPF, on the other hand, will occasionally generatea fault sample even if there are many alternative trajectories that have a higherposterior probability. The fault trajectory may thus be present in the set oftrajectories despite having a lower posterior probability than other candidatetrajectories. This behavior makes the RBPF more robust to local maxima.

To summarize, the advantage of k-best enumeration is that it stores the pos-terior probability of each trajectory exactly, rather than approximating theprobability by a number of repeated particles. This observation has been notedin the past [11]. However, the ability of k-best enumeration to track the truetrajectory is critically dependent on the number of alternative trajectories withhigh posteriors in relation to the number of tracked trajectories k. In particu-lar, if the posterior distribution over the trajectories is flat, k-best enumerationwill not consider the right trajectory, and the RBPF performs better. This ob-servation motivates the development of our algorithm that combines k-bestenumeration and RBPF in a greedy and stochastic search.

5.3 A Novel Combined Greedy/Stochastic Algorithm

In this section we present a novel algorithm that combines the greedy andstochastic approaches, of k-best enumeration and RBPF respectively, to ex-plore the discrete mode trajectories of the system (Figure 19). The algorithmmaintains two sets of trajectories: one set of stochastically generated trajecto-ries, updated with RBPF, and a separate set of leading trajectories, enumer-ated according to their posterior probability. The key idea is to generate suc-cessors to the leading k trajectories through both previously deterministicallygenerated successors to the current k leading trajectories and the candidatesgenerated by the RBPF. In this manner, the true trajectory that was dis-carded by simple k-best enumeration on the basis of having a lower posteriorprobability than other trajectories can still be tracked in the RBPF particleset and included in the deterministic set at a later time. In Section 6 we showempirically that the new algorithm is more robust than k-best and RBPFtaken individually, with only a minor performance penalty.

The combination of greedy and stochastic search introduces two challenges.First, in order to enable the comparison of the two sets of trajectories, basedon the posterior probability, each particle needs to be augmented with b(xd,1:t),

31

Page 32: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

...

...RBPF RBPF

k -

be

st

tra

jecto

rie

sR

BP

F p

art

icle

s

tF-1

tF

tP

tP+1

Fig. 19. Failure detection with our mixed method. At time tF the true trajectory(bold line) is no longer among the trajectories with the k highest posteriors. How-ever, the true trajectory is sampled stochastically and becomes a member of theRBPF particle set. At time tP+1, the posterior of the true trajectory becomes largeenough for it to be included in the leading k trajectories.

the posterior probability of a mode sequence xd,1:t. This can be updated with(4), by reusing the values PT and PO, computed in the importance samplingstep of the RBPF.

Second, trajectories generated by both greedy and stochastic search must becombined to give the belief state maintained by the mixed algorithm. Bothtechniques maintain a number of trajectories and a Gaussian distribution overthe continuous state, conditioned on each trajectory xd,1:t. The belief staterepresentation in the new algorithm is a mixture of Gaussians, obtained bysummation over the k trajectories with the highest b(xd,1:t) from both RBPFand greedy successor enumeration.

The pseudocode for the resulting algorithm is shown in Figure 20. Part 2(b) ofthe algorithm implements an A* search for the k successors with the highestposterior. As in Section 2.3, partial paths in the search tree correspond topartial assignments of modes to components. A goal candidate is one that hasa full assignment of modes, and for which the posterior probability b(xd,1:t)has been calculated. The two key additions to the method in Section 2.3 areas follows:

(1) Addition of RBPF particles to search queue: In 2(a), the RBPF updatestep is carried out as described in Section 3.7, and the resulting particles

32

Page 33: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

1. Initialization

• Initialize Rao-Blackwellised Particle Filter with N − k particles

• Initialize k-best trajectories

2. For t = 1, 2, . . .

(a) RBPF update

• Do RBPF update and add candidates to priority queue

(b) A* search for successors with highest posterior

• Add current trajectories to priority queue

• While size(new kbest) < k do

– Remove candidate from priority queue

– If candidate is a goal then

∗ Pop candidates until a unique one is found

∗ Add unique candidate to new kbest

– Else

∗ Expand candidate to its successors

∗ Add successors to priority queue

• Normalize new k-best trajectories

Fig. 20. Belief state update using greedy and stochastic search.

are added to the search queue as candidates. To ensure soundness ofthe A* search, candidates are added with the unnormalized observationfunction used by [38].

(2) Checking for uniqueness: Many identical candidates are generated byRBPF and added to the search queue. The set new kbest, however, holdsunique trajectories. Part 2(b) ensures that unique trajectories only areadded to the set of k best.

In the code shown, uniqueness is ensured by removing duplicate candidatesfrom the priority queue until a unique candidate is found. This relies on the factthat identical candidates will be neighbors in the priority queue. Alternatively,candidates can be checked for uniqueness when they are pushed onto thequeue.

We have therefore proposed a new algorithm that mixes greedy and stochasticsearch in hybrid estimation. In Section 6 we show that this approach is morerobust than k-best and RBPF taken individually.

6 Simulation Results

In this section we present three main contributions. First, using individualestimation runs, we demonstrate empirically the insight in Section 5.2, thatboth k-best enumeration and Rao-Blackwellised Particle Filtering can performpoorly depending on whether the posterior distribution is concentrated in rel-atively few mode trajectories, or is flat across many. Second, we perform adetailed empirical analysis of the relative performance of k-best enumerationand RBPF that confirms this result and illustrates the need for our robust

33

Page 34: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

time [s]

θ1

nominalcaptured ballactuator failure

0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5−0.4

−0.3

−0.2

−0.1

0

0.1

0.2

0.3

0.4

0.5

time [s]

θ2

nominalcaptured ballactuator failure

Fig. 21. The evolution of θ1 (left) and θ2 (right) for the acrobot model scenarios.

algorithm that balances the complementary approaches of greedy exploitationand stochastic exploration. Finally, we demonstrate that the new mixed al-gorithm is significantly more robust than either approach alone. We do notcompare the Rao-Blackwellised approach with a standard particle filteringapproach, since this analysis has been carried out extensively by [13,21,39];this analysis revealed that Rao-Blackwellisation increases the efficiency of theparticle filtering approach dramatically.

We consider the acrobatic robot shown in Figures 1 through 4. The goal ofhybrid estimation in this case is to filter out the acrobot’s hybrid state from asequence of noisy observations of θ2. We consider the following three scenariosfor hybrid estimation with the acrobot model:

(1) In the nominal scenario, the robot remains in the nominal mode 〈ball=no,actuator=ok〉 for the duration of the experiment.

(2) In the ball scenario, the robot captures a ball at time t = 1.3s and keepsit for the rest of the experiment. Capturing a ball increases the weightm2 at the end of the lower link and changes the resulting trajectory, asshown in Figure 21.

(3) In the failure scenario, the robot’s actuator breaks at t = 1.5. This eventcauses the actuator to stop exerting any torque, and alters the robot’strajectory, as shown in Figure 21.

6.1 Individual Estimation Runs

In this section we present results for hybrid estimation with the ball scenarioand the failure scenario for single executions of the k-best algorithm and thenew Rao-Blackwellised Particle Filter algorithm for CPHA.

34

Page 35: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

0 0.5 1 1.5 2 2.5 3 3.5 4<no,ok>

<yes,ok>

<no,failed>

<yes,failed>

time [s]

MAP estimate

ground truthkbest100rbpf100

0 0.5 1 1.5 2 2.5 3 3.5 40

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

time [s]

p(co

rrec

t dia

gnos

is)

kbest100rbpf100

Fig. 22. A single run for the ball capture scenario with concentrated posterior. Left:Maximum a posteriori (MAP) mode estimate computed by the Rao-BlackwellisedParticle Filter (rbpf) and the k-best filter (kbest). Right: probability of the correctdiagnosis 〈ball=yes, actuator=ok〉 for t ≥ 1.3s. There is a delay between the balltransition and when it is identified as the most likely diagnosis because the posteriorprobability of the true mode trajectory builds up only over time. The low probabilityof a transition to 〈ball=yes, actuator=no〉 biases the prior towards the 〈ball=no,actuator=ok〉 diagnosis.

6.1.1 Concentrated Posterior

First, we examine the original acrobatic robot model, shown in Figures 2 and 4.In this model, the probability of a ball transition is 0.01 and the probability ofan actuator failure is 0.0005. The mass of the ball is 4kg. Since all transitionshave low priors, the true posterior distribution is concentrated in a relativelysmall number of discrete mode trajectories.

Figure 22 shows the maximum a posteriori (MAP) estimate of the discretestate by k-best and RBPF estimation filters for the ball scenario, for a singlesimulation run with 100 tracked sequences. Both k-best and RBPF algorithmsestimate the MAP diagnosis correctly, and track the continuous state closely,except for an uncertain area close to the transition. Figure 23 shows the track-ing of θ1 for the ball scenario.

Figure 24 shows the results for a single simulation run with the failure scenario.In this case the k-best algorithm is able to diagnose the fault after a delay,but the Rao-Blackwellised Particle Filter does not diagnose the correct mode,even after several seconds. As described in Section 5.2, this is because the lowprior of the fault transition makes it unlikely that the true mode trajectoryis sampled unless many more particles are used. The true mode trajectory istherefore discarded. By contrast, k-best enumeration is able to retain the truemode sequence, since the posterior distribution is concentrated in relativelyfew mode sequences.

35

Page 36: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

0 0.5 1 1.5 2 2.5 3 3.5 4−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

time [s]

θ1[rad]

ground truthkbest100 estimaterbpf100 estimate

Fig. 23. Filtered θ1 for an execution of the ball scenario.

0.5 1 1.5 2 2.5 3 3.5 4

<no,ok>

<yes,ok>

<no,failed>

<yes,failed>

time [s]

MAP estimate

ground truthkbest100rbpf100

0.5 1 1.5 2 2.5 3 3.5

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

time [s]

p(co

rrec

t dia

gnos

is)

kbest100rbpf100

Fig. 24. A single run for the actuator failure scenario with concentrated posterior.Left: Maximum a posteriori (MAP) estimate computed by the RBPF and the k-bestfilter. Right: probability of the correct diagnosis 〈ball=no, actuator=failed〉 fort ≥ 1.5s. The k-best algorithm is able to diagnose the fault after a delay; theprobability assigned to the ground truth drops close to zero immediately after thefault, but gradually increases as the observations reveal that this is in fact the mostlikely diagnosis. The RBPF algorithm fails to sample the true mode trajectory dueto its low prior, and hence the probability of the ground truth falls to zero andremains there.

6.1.2 Flat Posterior

We now consider single simulation results for a modified model, in which theposterior distribution is spread out across many distinct mode trajectories.The modified acrobot model has the same components as before, however themodel parameters are different. The new transition priors are shown in Fig-ure 25. Note that the probability of the acrobot catching a ball has changedfrom 0.05 to 0.5. Also, the mass of the ball is decreased to 1kg. The increasedtransition probability means that whenever θ1 ≥ 0.55 the number of modesequences with a high prior grows exponentially. Because the effect of a tran-

36

Page 37: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

actuator

ok

failed

0.99

1.0

ball

yes

no

55.01 ≤θ

1.0

0.5

55.01 >θ 0.5

55.01 ≤θ1.0

0.5

0.5

0.01

55.01 >θ

Fig. 25. Probabilistic hybrid automata for the body and actuator of the modifiedacrobot example

0 0.5 1 1.5 2 2.5 3 3.5 4<no,ok>

<yes,ok>

<no,failed>

<yes,failed>

time [s]

MAP estimate

ground truthkbest100rbpf100

0.5 1 1.5 2 2.5 3 3.5 4

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

time [s]

p(co

rrec

t dia

gnos

is)

kbest100rbpf100

Fig. 26. A single run for the actuator failure scenario with flat posterior. Left:Maximum a posteriori (MAP) estimate computed by the RBPF and the k-bestfilter. Right: probability of the correct diagnosis 〈ball=no, actuator=failed〉 fort ≥ 1.5s.

sition on observations is initially small, a large number of distinct trajectorieswill have high posterior probability.

The modified acrobot model is an example where the fair sampling of theRao-Blackwellised Particle Filter can outperform the greedy search in the k-best filter, as discussed in Section 5.2. Figure 26 shows the results for a singlehybrid estimation run on the failure scenario, this time with the flat poste-rior distribution. The k-best enumeration approach fails to diagnose the faultcorrectly because very many distinct hypotheses with high posterior proba-bility grow whenever the ball transition is enabled. The stochastic samplingapproach of Rao-Blackwellised Particle Filtering, on the other hand, samplesthe failure transition fairly. When a particle samples the transition close towhere it occurred in reality, the observation likelihood of that particle growsuntil it dominates the trajectory space, giving the correct diagnosis. With aflat posterior therefore, RBPF clearly outperforms k-best enumeration.

37

Page 38: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

0 0.5 1 1.5 2 2.5 3 3.5 4−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1

time [s]

θ1[rad]

ground truthkbest100 estimaterbpf100 estimate

Fig. 27. Continuous tracking of θ1 with the failure scenario for a flat posterior.The RBPF has much more accurate continuous tracking than the k-best algorithmfor this example. The state variable θ1 is not observed directly, hence without anaccurate estimate of the discrete mode sequence, the continuous dynamics of thesystem are unknown, leading to large estimation errors in the continuous state

These experimental results, therefore, provide empirical validation of the in-sight described in Section 5.2. While these are examples of single simulationruns only, in Section 6.2 we carry out an extensive performance comparisonthat confirms this result and provides empirical motivation for a mixed methodthat balances greedy and stochastic approaches to hybrid estimation.

6.2 Performance Comparison

In this section we carry out a detailed empirical comparison of the perfor-mance of k-best enumeration and the new Rao-Blackwellised Particle Filterfor CPHA. We show that the new mixed method is significantly more robustthan either method alone.

6.2.1 Performance metrics

One of the biggest obstacles to evaluating the performance of hybrid stateestimation algorithms is that inference with hybrid models is, in general, NP-hard [10], and it is very difficult to obtain the true posterior distributionp(xc,t,xd,t|y1:t,u0:t). Sometimes, this distribution can be approximated by aparticle filter with a large number of samples; however, the accuracy of suchapproximations may not be bounded tightly enough.

Instead, we use the following two metrics for a given algorithm with a fixednumber of tracked sequences:

(1) The percentage of the diagnostic faults, defined as # of wrong diagnoses# time steps

.

38

Page 39: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

101

102

0

0.1

0.2

0.3

0.4

0.5

frac

tion

of d

iagn

ostic

err

ors

Number of tracked sequences

kbestrbpfmixed

101

102

0

0.1

0.2

0.3

0.4

0.5

Number of tracked sequences

MSE

/tim

este

p

kbestrbpfmixed

Fig. 28. Performance for the nominal scenario. Left: Percentage of diagnostic errors.Right: Mean square estimation error of the continuous state.

Wrong diagnoses are defined as MAP estimates of the discrete state atthe fringe that are not the same as the ground truth.

(2) The mean square estimation error of the continuous estimate correspond-ing to the MAP diagnosis. This is defined as ((xc,t−xc,t)

T (xc,t−xc,t))1/2,

where xc,t is the continuous estimate corresponding to the MAP modeestimate, and xc,t is the continuous state ground truth. This measure isaveraged over all time steps and experiments.

Each algorithm was run on 20 random observation sequences with fixed modeassignments. The results given here show the mean and standard deviation(shown as error bars) of the performance metrics for these runs. For the mixedmethod, half of the trajectories were evolved using the RBPF.

6.2.2 Concentrated Posterior

Figures 28 through 30 show the percentage of diagnostic errors and the meansquare tracking error for the three scenarios considered for the acrobot modelwith concentrated posterior.

Overall, the Rao-Blackwellised Particle Filter gives a higher number of diag-nostic errors than k-best for the reasons discussed in Section 5.2. The mixedmethod performs almost as well as k-best enumeration. These results are thesame when we compare the performance with respect to the run time of thealgorithms. Figure 31 shows this comparison. We found that the run times ofthe three algorithms were different by a factor less than 1.5.

6.2.3 Flat Posterior

We now compare the performance of the k-best algorithm and the RBPFwhen estimating the hybrid state of the modified acrobot model, which has

39

Page 40: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

100

101

102

103

104

−0.1

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7fr

actio

n of

dia

gnos

tic e

rror

s

Number of tracked sequences

kbestrbpfmixed

100

101

102

103

104

0

0.2

0.4

0.6

0.8

1

1.2

1.4

Number of tracked sequences

MSE

/tim

este

p

kbestrbpfmixed

Fig. 29. Performance for the ball scenario. Left: Percentage of diagnostic errors.Right: Mean square estimation error of the continuous state. The k-best algorithmmakes almost no diagnostic errors. The Rao-Blackwellised Particle Filter on theother hand, makes more than 40 per cent diagnostic errors on average for small k;this decays to a minimum value as the number of tracked sequences increases. Inaddition, there is far greater variance in the case of the RBPF; in an applicationsuch as fault diagnosis, this is particularly undesirable, since reliable performanceis essential.

100

101

102

103

104

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

frac

tion

of d

iagn

ostic

err

ors

Number of tracked sequences

kbestrbpfmixed

100

101

102

103

104

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

Number of tracked sequences

MSE

/tim

este

p

kbestrbpfmixed

Fig. 30. Performance for the actuator failure scenario with concentrated posterior.Left: Percentage of diagnostic errors. Right: Mean square estimation error of thecontinuous state. For small k, both RBPF and k-best have a high proportion ofdiagnostic errors; for k > 20, however, the k-best algorithm outperforms the RBPF.Also, whereas the performance of the RBPF improves gradually as k increases, thek-best algorithm shows a large shift in performance between k = 20 and k = 50.The mixed method behaves in a similar manner to k-best enumeration, except thatthe shift in performance occurs at approximately double the number of trackedsequences, since only half of the sequences are being evolved greedily.

40

Page 41: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

10−3

10−2

10−1

100

101

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

frac

tion

of d

iagn

ostic

err

ors

Runtime [s / time step]

kbestrbpfmixed

10−3

10−2

10−1

100

101

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

Runtime [s / time step]

MSE

/tim

este

p

kbestrbpfmixed

Fig. 31. Performance for the failure scenario for concentrated posterior with respectto run time. Left: Percentage of diagnostic errors. Right: Mean square estimationerror of the continuous state.

100

101

102

103

104

−0.1

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

frac

tion

of d

iagn

ostic

err

ors

Number of tracked sequences

kbestrbpfmixed

100

101

102

103

104

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

Number of tracked sequences

MSE

/tim

este

p

kbestrbpfmixed

Fig. 32. Performance for the failure scenario with flat posterior. Left: Percentage ofdiagnostic errors. Right: Mean square estimation error of the continuous state. Thek-best method performs very poorly, except for a very large number of tracked se-quences, while the RBPF performs well. The mixed method has similar performanceto the RBPF.

a flat posterior distribution. Figure 32 shows the performance of the k-bestand RBPF algorithms for the failure scenario. In this case, the RBPF clearlyoutperforms the k-best method both in terms of diagnostic errors and meansquare estimation error. The large number of sequences with high posteriorprevents the strict enumeration of the k-best method from considering theinitially less likely failure sequence, except with very large k, while the RBPFsamples the actuator failure transition fairly. The mixed method exploits thisstochastic approach, performing only slightly worse than the RBPF alone.

41

Page 42: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

6.3 Discussion of Results

In the experimental results an interesting pattern emerges: the k-best algo-rithm undergoes a phase shift in performance, depending on whether or not kis large enough for trajectories similar to the ground truth to be tracked. Bycontrast, the performance of the Rao-Blackwellised Particle Filter convergesless quickly than the k-best filter to a low fraction of diagnostic errors (SeeFigure 30). Since the RBPF approximates the true posterior and duplicateshigh likelihood hypotheses, increasing the number of particles simply makesthe approximation closer to the true posterior. Hence there is a gradual con-vergence and not the phase shift seen for k-best.

The critical value of k that greedy enumeration needs to track in order toperform well, depends on the concentration of the posterior distribution. If theposterior distribution is concentrated in a relatively small number of distinctsequences the k-best method will perform well even for small k. In these cases,k-best enumeration typically outperforms RBPF, since it does not duplicatehypotheses, leading to unnecessary approximation. For a flat posterior, onthe other hand, the critical value of k is so large when detecting rare eventsthat many trajectories will need to be tracked, to reliably detect the fault. Inthese cases, the RBPF will perform better for small k, since the distributionis sampled fairly.

We have shown, empirically, that the mixed method combines the benefits ofthe two methods. While its performance is marginally worse than the RBPF ork-best in their best-case scenario, by balancing greedy and stochastic search,the new method is much more robust to the choice of model parameters thanthe RBPF and k-best individually.

7 Related work

Several algorithms have addressed the problem of the exponential growthof Gaussian mixtures. One class of solutions are multiple-model estimationschemes, which maintain a pre-determined number of Gaussians. These in-clude the generalized pseudo-Bayesian algorithm (GPB) [40], the DetectionEstimation Algorithm (DEA) [41], the Interacting Multiple Model (IMM) al-gorithm [27], and the residual correlation Kalman filter bank [42]. All of thesetechniques have a fixed-time, deterministic strategy for pruning discrete modesequences.

More recently, [28] proposed a k-best filtering solution for SLDS models. Inaddition to pruning, their algorithm implements several techniques not present

42

Page 43: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

in our algorithm, including collapsing of the mode sequences, smoothing, andweak decomposition. This approach was later extended in [11], to the settingof hybrid dynamic Bayesian networks with SoftMax transitions, using numer-ical integration techniques instead of the Kalman Filter. Similar to ours, theiralgorithm provides an any-time solution to the hybrid state estimation prob-lem.

In the particle filtering community, several papers [14,43,44,45] have proposedusing the bootstrap particle filter to perform state estimation in hybrid mod-els. An early application of the Rao-Blackwellisation method to reducing thevariance of sampling in SLDS models was introduced by [22]. Their algorithm,named the Random Sampling Algorithm (RSA), sampled the sequences ofmode assignments using the distribution p(xd,t|xd,0:t−1,y1:t,u0:t). [46] and [47]introduced the Selection Step, which is crucial for the convergence of sequen-tial Monte Carlo methods and framed the problem in the general particlefiltering framework. In addition, they proved several properties regarding theconvergence and variance reduction of Rao-Blackwellisation schemes. [43] fur-ther extended this work and described an algorithm for fixed-lag smoothingwith MCMC steps. Finally, [13] introduced a procedure, called one-step look-ahead, which computes the total probability for the sequences stemming froma given sample and moves the selection step before the importance samplingstep, at the cost of evaluating the Kalman Filter residual for all successormodes. All of these techniques were designed for linear switching models with-out autonomous transitions.

Concurrently with our work published in [18], the combination of a Rao-Blackwellised particle filter with an Unscented Kalman Filter for fault de-tection was proposed independently by [21]. In addition, they incorporatedthe look-ahead step proposed by [13].

Two complementary approaches for improving the performance of particle fil-ters were proposed by [31] and [48]. The first one, the Risk-sensitive ParticleFilter, incorporates a model of cost into the sampling process. The cost isimplemented automatically using an MDP value function tracking. The sec-ond approach improves the performance of particle filtering by automaticallychoosing an appropriate level of abstraction in a multiple-resolution hybridmodel. Maintaining samples at a lower resolution prevents hypotheses frombeing eliminated due to a lack of samples.

43

Page 44: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

8 Conclusion

In this paper, we investigated the problem of estimating the state of a systemrepresented with probabilistic hybrid models. We presented an efficient Rao-Blackwellised particle filtering algorithm, developed in Section 3, that handlesthe autonomous mode transitions, concurrency, and nonlinearities present inConcurrent Probabilistic Hybrid Automata (CPHA). We extended the class ofautonomous mode transitions that can be handled by both k-best enumerationand Rao-Blackwellised Particle Filtering for CPHA to multivariable lineartransition guards, in the case of piecewise constant transition distributions,and to polynomial transition distributions of arbitrary order, in the case ofsingle variables.

This new algorithm allows a unified treatment of approximate hybrid estima-tion through both Rao-Blackwellised particle filtering and k-best filtering. Asimulated acrobatic robot was used to develop insight about the relative per-formance of the two approaches. The results showed that when the posterioris concentrated in a few nominal or single-fault sequences, the k-best filteris a clear winner. However, when the distribution over mode trajectories isrelatively flat, the trajectory corresponding to the correct diagnosis may beleft out of the leading set of mode sequences. In such situations, the randomsampling approach of the Rao-Blackwellised particle filter is more successful.

We therefore developed a new algorithm that combines greedy and stochas-tic search by tracking two sets of mode trajectories with k-best and Rao-Blackwellised particle filters, and uses both sets to generate the set of leadingtrajectories at the next time step. Simulations showed that the algorithm ismore robust than k-best and RBPF taken individually, with only a minorperformance penalty.

References

[1] Mars science laboratory, http://mars.jpl.nasa.gov/msl/ (2006).

[2] M. Montemerlo, J. Pineau, N. Roy, S. Thrun, V. Verma, Experiences witha mobile robotic guide for the elderly, in: Proceedings of the AAAI NationalConference on Artificial Intelligence, AAAI, Edmonton, Canada, 2002.

[3] B. C. Williams, S. Chung, V. Gupta, Mode estimation of model-based programs:Monitoring systems with complex behavior, in: Proceedings of the InternationalJoint Conference on Artificial Intelligence, 2001.

[4] O. B. Martin, B. C. Williams, M. D. Ingham, Diagnosis as approximate beliefstate enumeration for probabilistic concurrent constraint automata, in: 20thNational Conference on Artificial Intelligence, 2005.

44

Page 45: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

[5] M. Hofbaur, B. C. Williams, Mode estimation of probabilistic hybrid systems,in: Intl. Conf. on Hybrid Systems: Computation and Control, 2002.

[6] C. P. Gomes, B. Selman, H. Kautz, Boosting combinatorial search throughrandomization, in: Proceedings of the 15th National Conference on ArtificialIntelligence, 1998.

[7] R. S. Sutton, A. G. Barto, Reinforcement Learning: An Introduction, MITPress, 1998.

[8] M. W. Hofbaur, B. C. Williams, Hybrid estimation of complex systems, IEEETransactions on Systems, Man, and Cybernetics - Part B: Cybernetics.

[9] V. Verma, J. Langford, R. Simmons, Non-parametric fault identificationfor space rovers, in: International Symposium on Artificial Intelligence andRobotics in Space (iSAIRAS), 2001.

[10] U. Lerner, R. Parr, Inference in hybrid networks: Theoretical limits andpractical algorithms, in: Proceedings of the 17th Annual Conference onUncertainty in Artificial Intelligence (UAI-01), Seattle, Washington, 2001, pp.310–318.

[11] U. Lerner, Hybrid bayesian networks for reasoning about complex systems,Ph.D. thesis, Stanford University (October 2002).

[12] A. Doucet, On sequential simulation-based methods for bayesian filtering,Tech. Rep. CUED/F-INFENG/TR. 310, Cambridge University Department ofEngineering (1998).

[13] R. Morales-Menendez, N. de Freitas, D. Poole, Real-time monitoring of complexindustrial processes with particle filters, in: Proceedings of Neural InformationProcessing Systems (NIPS), 2002.

[14] R. Dearden, D. Clancy, Particle filters for real-time fault detection in planetaryrovers, in: Proceedings of the 13th International Workshop on Principles ofDiagnosis (DX02), 2002, pp. 1–6.

[15] A. Doucet, N. de Freitas, K. Murphy, S. Russell, Rao-Blackwellised particlefiltering for dynamic bayesian networks, in: Proceedings of the Conference onUncertainty in Artificial Intelligence (UAI), 2000.

[16] N. de Freitas, Rao-Blackwellised particle filtering for fault diagnosis, IEEEAerospace.

[17] X. Koutsoukos, J. Kurien, F. Zhao, Monitoring and diagnosis of hybrid systemsusing particle filtering methods, in: MTNS 2002, 2002.

[18] S. Funiak, B. C. Williams, Multi-modal particle filtering for hybrid systemswith autonomous mode transitions, in: Proceedings of SafeProcess 2003 (alsopublished in DX-2003, 2003.

[19] B. Anderson, J. Moore, Optimal Filtering, Information and System SciencesSeries, Prentice Hall, 1979.

45

Page 46: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

[20] S. J. Julier, J. K. Uhlmann, A new extension of the Kalman filter tononlinear systems, in: Proceedings of AeroSense: The 11th Symposium onAerospace/Defense Sensing, Simulation and Controls, 1997.

[21] F. Hutter, R. Dearden, The gaussian particle filter for diagnosis of non-linearsystems, in: Proceedings of the 14th International Conference on Principles ofDiagnosis(DX’03), Washington, DC, USA, 2003, pp. 65–70.

[22] H. Akashi, H. Kumamoto, Random sampling approach to state estimation inswitching environments, Automatica 13 (1977) 429–434.

[23] V. Pavlovic, J. Rehg, T.-J. Cham, K. Murphy, A dynamic Bayesian networkapproach to figure tracking using learned dynamic models, in: Proceedings ofICCV, 1999.

[24] S. Narasimhan, G. Biswas, G. Karasai, F. Zhao, Building observers to addressfault isolation and control problems in hybrid dynamic systems, in: Proceedingsof the IEEE Internation Conference on Systems, Man, and Cybernetics (SMC2000), 2000, pp. 2393–2398.

[25] B. Buchberger, F. Winkler (Eds.), Grobner Bases and Applications, CambridgeUniv. Press, 1998.

[26] P. Nayak, Automated Modelling of Physical Systems, Lecture Notes in ArtificialIntelligence, Springer, 1995.

[27] H. Blom, Y. Bar-Shalom, The interacting multiple model algorithm for systemswith Markovian switching coefficients, IEEE Transactions on AutomaticControl 33.

[28] U. Lerner, R. Parr, D. Koller, G. Biswas, Bayesian fault detection and diagnosisin dynamic systems, in: Proc. of the 17th National Conference on A. I., 2000,pp. 531–537.URL citeseer.nj.nec.com/lerner00bayesian.html

[29] B. C. Williams, P. Nayak, A model-based approach to reactive self-configuringsystems, in: Proceedings of AAAI-96, 1996, pp. 971–978.

[30] B. Ng, L. Peshkin, A. Pfeffer, Factored particles for scalable monitoring,in: Proceedings of the Eighteenth Conference on Uncertainty in ArtificialIntelligence, 2002.

[31] V. Verma, S. Thrun, R. Simmons, Variable resolution particle filter, in: InProceedings of International Joint Conference on Artificial Intelligence, 2003.

[32] G. Casella, C. P. Robert, Rao-Blackwellisation of sampling schemes, Biometrika83 (1) (1996) 81–94.

[33] K. Murphy, S. Russell, Rao-Blackwellised particle filtering for dynamic Bayesiannetworks, in: A. Doucet, N. de Freitas, N. Gordon (Eds.), Sequential MonteCarlo Methods in Practice, Springer-Verlag, 2001, Ch. 24, pp. 499–515.

46

Page 47: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

[34] T. Dean, K. Kanazawa, A model for reasoning about persistence and causation,Computational Intelligence 5 (3) (1989) 142–150.

[35] P. Nayak, B. C. Williams, Fast context switching in real-time propositionalreasoning, in: Proceedings of AAAI-97, 1997, pp. 50–56.

[36] H. Joe, Approximations to multivariate normal rectangle probabilities basedon conditional expectations, Journal of the American Statistical Association90 (431) (1995) 957–964.

[37] A. Genz, K.-S. Kwong, Numerical evaluation of singular multivariate normaldistributions, J. Stat. Comp. Simul 68 (1-21).

[38] P. Maybeck, R. Stevens, Reconfigurable flight control via multiple modeladaptive control methods, IEEE Transactions on Aerospace and ElectronicSystems 27 (3) (1991) 470–480.

[39] A. Doucet, N. J. Gordon, V. Krishnamurthy, Particle filters for state estimationof jump markov linear systems, IEEE Transactions on Signal Processing 49 (3)(2001) 613–624.

[40] G. Ackerson, K. Fu, On state estimation in switching environments, IEEETransactions on Automatic Control 15 (1970) 10–17.

[41] J. Tugnait, Detection and estimation for abruptly changing systems,Automatica 18 (1982) 607–615.

[42] P. Hanlon, P. Maybeck, Multiple-model adaptive estimation using a residualcorrelation kalman filter bank, IEEE Transactions on Aerospace and ElectronicSystems 36 (2) (2000) 393–406.

[43] A. Doucet, N. de Freitas, N. J. Gordon, Sequential Monte Carlo Methods inPractice, Springer-Verlag, 2001.

[44] D. Avitzour, A stochastic simulation Bayesian approach to multitarget tracking,IEE Proceedings on Radar Sonar and Navigation 142 (2) (1995) 41–44.

[45] G. Kitagawa, Monte Carlo filter and smoother for non-Gaussian nonlinear statespace models, Journal of Computational and Graphical Statistics 5 (1996) 1–25.

[46] N. Metropolis, S. Ulam, The monte carlo method, American StatisticalAssociation 44 (1949) 335–341.

[47] N. J. Gordon, D. J. Salmond, A. F. M. Smith, Novel approach to nonlinear/non-GaussianBayesian state estimation, IEE Proceedings - F 140 (2) (1993) 107–113.

[48] S. Thrun, J. Langford, V. Verma, Risk sensitive particle filters, in: NeuralInformation Processing Systems (NIPS), 2001.

47

Page 48: A Combined Stochastic and Greedy Hybrid Estimation Capability …groups.csail.mit.edu/mers/papers/BlackmoreFuniakWilliamsRAS06.pdf · Hybrid Automata (CPHA) are convenient tools for

Lars Blackmore has B.A. and M.Eng. degrees from the Uni-

versity of Cambridge (UK) and is currently a PhD candidate at MITin the department of Aeronautics and Astronautics. He has pub-lished seven papers in refereed conference proceedings. His researchinterests include estimation and control of robotic and autonomoussystems under stochastic uncertainty.

Stanislav Funiak holds B.S. and M.Eng. degrees from MIT

and is currently a Ph.D. student in Robotics at Carnegie MellonUniversity. He pursues research in probabilistic reasoning, machinelearning, and distributed systems. He is interested in algorithmsand modeling formalisms that enable robust solutions to real-worldproblems in sensor networks.

Brian C. Williams is an associate professor at MIT in the

department of Computer Science and Electrical Engineering. Hereceived his PhD from MIT in 1989. He pioneered multiple fault,model-based diagnosis at the Xerox Palo Alto Research Center, andat NASA Ames Research Center formed the Autonomous SystemsArea. He co-invented the Remote Agent model-based autonomouscontrol system, which flew on the NASA Deep Space One probein 1999. He has served as guest editor of the Artificial Intelligence

Journal and has been on the editorial board of the Journal of Artificial Intelligence Research.He is currently a member of the Advisory Council of the NASA Jet Propulsion Laboratoryat Caltech. Prof. Williams’ research concentrates on model-based autonomy - the creationof long-lived autonomous systems that are able to explore, command, diagnose and repairthemselves using fast, commensense reasoning.

48