Machine learning methods in Statistical Model Checking and System Design
Guido Sanguinetti1,2
1Adaptive and Neural Computation, School of Informatics, University of Edinburgh
2SynthSys, Synthetic and Systems Biology, University of Edinburgh
joint work with D. Milios (Edinburgh) and Luca Bortolussi (Saarbruecken/Trieste)
RV Tutorial, 22/09/15
Outline
Motivation and background
  Properties and qualitative data
  CTMCs and model checking
Smoothed model checking
  Uncertainty and smoothness
  Gaussian processes
  Smoothed model checking results
Robust system design from formal specification
  Signal Temporal Logic
  GP-UCB optimisation
  Experimental Results
U-check
Broad picture
Machine learning and formal methods
Formal methods give us tools to run and reason about models. Machine learning gives us methods to refine models in the light of observations.
Ultra-condensed summary of this talk
Some formal modelling questions can be viewed as function estimation problems, for which machine learning methods yield efficient solutions.
Example 1
[Figure: reaction network of the LacZ operon model, with species Rbs, Ribosome, PLac, PLacRNAP, RNAP, TrLacZ1, TrLacZ2, RbsLacZ, TrRbsLacZ and LacZ, and reactions r_1 to r_11 grouped into transcription and translation steps.]
We build a model of a real system to investigate properties of the trajectories of the system.
Example 2
Sales pitch
- Building models of real-world systems involves uncertainty (typically in the parameters, but also in the structure)
- Verifying/designing properties on uncertain models is unfeasible in a brute-force way
- Rest of the talk: use a smoothness result and treat the problem as a prediction/optimisation problem in machine learning
The problems tackled

Satisfaction under Uncertainty Problem
Can we compute satisfaction probabilities of formal properties for uncertain models?

Parameter Estimation Problem
Can we learn model parameters from qualitative data, i.e. from observations of truth values of formal (temporal logic) properties?

Synthesis Problem
Can we learn model parameters satisfying (as robustly as possible) a set of formal (temporal logic) specifications?
Population CTMC
We consider population CTMC models, that is, CTMC models in the biochemical-reactions style.

State space
The state space is described by a set of n variables X = (X1, ..., Xn) ∈ N^n, each counting the number of agents (jobs, molecules, ...) of a given kind.

Transitions
The dynamics is given by a set of chemical reactions of the form

s1 X1 + ... + sn Xn → r1 X1 + ... + rn Xn,

with a rate given by a function f(X, θ), depending on the system variables and on a set of parameters θ. Trajectories of such a model can be sampled as in the sketch below.
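As a concrete illustration (not part of the original tutorial material), here is a minimal Python sketch of sampling one trajectory of a population CTMC with Gillespie's stochastic simulation algorithm; the birth-death system at the bottom and all names are illustrative assumptions.

```python
import numpy as np

def ssa(x0, stoich, rates, theta, t_end, seed=None):
    """Sample one trajectory of a population CTMC with Gillespie's SSA.

    x0:     initial population vector
    stoich: (n_reactions, n_species) array of net state changes (r - s)
    rates:  function (x, theta) -> propensity vector f(X, theta)
    """
    rng = np.random.default_rng(seed)
    t, x = 0.0, np.asarray(x0, dtype=float).copy()
    times, states = [t], [x.copy()]
    while t < t_end:
        a = rates(x, theta)                  # reaction propensities
        a0 = a.sum()
        if a0 <= 0:                          # absorbing state: nothing can fire
            break
        t += rng.exponential(1.0 / a0)       # exponential waiting time
        j = rng.choice(len(a), p=a / a0)     # pick which reaction fires
        x += stoich[j]
        times.append(t)
        states.append(x.copy())
    return np.array(times), np.array(states)

# Illustrative birth-death system: 0 -> X at rate k_b, X -> 0 at rate k_d * X
stoich = np.array([[1.0], [-1.0]])
rates = lambda x, th: np.array([th[0], th[1] * x[0]])
times, states = ssa([10], stoich, rates, theta=(5.0, 0.1), t_end=50.0, seed=0)
```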
Metric Interval Temporal Logic

- We express qualitative properties in Metric (Interval) Temporal Logic.
- Linear time: in experiments we observe single realisations.
- Metric bounds: we can observe a system only for a finite amount of time.

Syntax

ϕ ::= tt | µ | ¬ϕ | ϕ1 ∧ ϕ2 | ϕ1 U[T1,T2] ϕ2

As customary: F[T1,T2]ϕ ≡ tt U[T1,T2] ϕ, G[T1,T2]ϕ ≡ ¬F[T1,T2]¬ϕ.
MITL semantics

Semantics
µ are (non-linear) inequalities on vectors of n variables:
- x, t |= µ if and only if µ(x(t)) = tt;
- x, t |= ϕ1 U[T1,T2] ϕ2 if and only if ∃t1 ∈ [t + T1, t + T2] such that x, t1 |= ϕ2 and ∀t0 ∈ [t, t1], x, t0 |= ϕ1.

As customary: F[T1,T2]ϕ ≡ tt U[T1,T2] ϕ, G[T1,T2]ϕ ≡ ¬F[T1,T2]¬ϕ.

Formulae can be given both a qualitative (boolean) and a quantitative (robustness) semantics.

Example
G[0,100](XI < 40) ∧ F[0,60]G[0,40](5 ≤ XI ≤ 20) ∧ G[30,50](XI > 30) encodes transient behaviour.
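To make the boolean semantics concrete, here is a small sketch (hypothetical helper functions, not from the tutorial) checking time-bounded F and G on a uniformly sampled trajectory; exact monitoring of CTMC traces would instead work on the piecewise-constant signal between jump times.

```python
import numpy as np

# Boolean monitoring on a uniformly sampled trajectory (times, sig),
# where sig[i] is the truth value of an atomic proposition at times[i].

def eventually(times, sig, t0, T1, T2):
    """F_[T1,T2] at time t0: the atom holds at some sample in [t0+T1, t0+T2]."""
    window = (times >= t0 + T1) & (times <= t0 + T2)
    return bool(np.any(sig[window]))

def globally(times, sig, t0, T1, T2):
    """G_[T1,T2] at time t0: the atom holds at every sample in [t0+T1, t0+T2]."""
    window = (times >= t0 + T1) & (times <= t0 + T2)
    return bool(np.all(sig[window]))

# Example: check G_[0,100](X_I < 40) at t0 = 0 on a made-up trace of X_I
times = np.linspace(0.0, 120.0, 600)
x_I = 30 + 5 * np.sin(times / 10.0)      # toy trajectory, always below 40
sig = x_I < 40                           # boolean signal of the atom
print(globally(times, sig, t0=0.0, T1=0.0, T2=100.0))   # True
```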
Model Checking

- Model checking: given a stochastic process and a formula, compute the probability q of the formula being true.
- Problem: almost always analytically impossible, and often computationally unfeasible.
- Statistical model checking: q can be estimated as the average of the truth values Ti of the formula over many sampled trajectories.
- This can be given a Bayesian flavour by specifying a prior p(q) and estimating the posterior p(q | Ti), using the fact that p(Ti | q) is Bernoulli, together with Bayes' theorem.
- This is equivalent to introducing pseudocounts → useful when rare events could be present.
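A minimal sketch of this Bayesian flavour of SMC, using the conjugate Beta-Bernoulli update; the prior pseudocounts a and b are assumptions of this illustration.

```python
import numpy as np
from scipy import stats

def bayesian_smc(truth_values, a=1.0, b=1.0):
    """Beta-Bernoulli posterior over the satisfaction probability q.

    truth_values: 0/1 outcomes of the formula on sampled trajectories
    a, b:         Beta prior pseudocounts (a = b = 1 is the uniform prior)
    """
    k, n = int(np.sum(truth_values)), len(truth_values)
    posterior = stats.beta(a + k, b + n - k)     # conjugate update
    mean = (a + k) / (a + b + n)                 # posterior mean estimate of q
    lo, hi = posterior.ppf([0.025, 0.975])       # 95% credible interval
    return mean, (lo, hi)

# Even when no sampled run satisfies the formula, the pseudocounts keep
# the estimate away from an over-confident zero -- the rare-event case.
print(bayesian_smc(np.zeros(100)))
```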
Outline
Motivation and background
  Properties and qualitative data
  CTMCs and model checking
Smoothed model checking
  Uncertainty and smoothness
  Gaussian processes
  Smoothed model checking results
Robust system design from formal specification
  Signal Temporal Logic
  GP-UCB optimisation
  Experimental Results
U-check
Uncertainty in models
- Quantitative models require parameters. Is domain expertise sufficient to select a particular value of the parameters (out of an uncountable set)?
- Certainly not in many applications, such as systems biology.
- What are the implications for model checking?
- Satisfaction probabilities of formulae are defined for fully specified models.

All model checking algorithms rely on a fully specified model.
Some further definitions

Uncertain CTMC
Consider a population CTMC model M depending on a set of d parameters θ. We only know θ ∈ D, but not its precise value. We call M an uncertain CTMC (notation: Mθ for fixed θ).

Satisfaction function
Suppose we have an uncertain CTMC M, θ ∈ D, and a Metric Temporal Logic formula ϕ. Then the satisfaction probability of ϕ w.r.t. M is a function of θ:

p(ϕ | θ) = Prob{Mθ, x, 0 |= ϕ}.

We call p(ϕ | θ) the satisfaction function.

Goal: can we find statistical methods to approximate p(ϕ | θ) analytically?
Key property of the satisfaction function

Theorem (Bortolussi, Milios and G.S. 2014)
Consider a population CTMC model M whose transition rates depend polynomially on a set of d parameters θ ∈ D. Let ϕ be a MITL formula. The satisfaction function p(ϕ | θ) : D → [0,1] is a smooth function of the parameters θ.

This means that we can transfer information across neighbouring points. Can we define a statistical model checking technique which simultaneously computes the whole satisfaction function?
Smoothness proof (sketch)

- Use the fact that the space of trajectories of a CTMC is a countable union of finite-dimensional spaces (indexed by the finite sequence of transitions).
- Write explicitly the measure on each finite-dimensional space.
- Show that the series of derivatives converges.
Random functions

- Bayesian SMC uses the fact that the satisfaction probability of a formula given a model is a number in [0,1], and prior distributions on numbers in [0,1] exist (the Beta distribution).
- The object we are now interested in is a smooth function. How do you construct probability distributions over random functions?
- Simplest idea: fix a set of functions ξi(θ), i = 1, ..., N (basis functions).
- Take a linear combination f of the basis functions with random coefficients wi: f = Σi wi ξi(θ).
- If the coefficients are (multivariate) Gaussian distributed, the value of f at any point θ̂ is Gaussian distributed, as in the sketch below.
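A short sketch of this weight-space construction: Gaussian-bump basis functions (an arbitrary illustrative choice) combined with standard normal weights give samples from a Gaussian prior over functions.

```python
import numpy as np

# Weight-space construction: fixed Gaussian-bump basis functions with
# standard normal random weights w ~ N(0, I).
centres = np.linspace(0.0, 1.0, 20)
width = 0.08

def basis(theta):
    """Design matrix: xi_i(theta) for every input point and basis function."""
    return np.exp(-0.5 * ((theta[:, None] - centres) / width) ** 2)

theta = np.linspace(0.0, 1.0, 200)
Phi = basis(theta)                         # shape (200, 20)

rng = np.random.default_rng(0)
w = rng.standard_normal(len(centres))      # w ~ N(0, I)
f = Phi @ w                                # one sample of the random function
# f evaluated at any finite set of points is multivariate Gaussian, with
# covariance Phi @ Phi.T (the Gram matrix of the following slides).
```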
Important exercise

Let ϕ1(x), ..., ϕN(x) be a fixed set of functions, and let f(x) = Σi wi ϕi(x). If w ∼ N(0, I), compute:

1. the single-point marginal distribution of f(x);
2. the two-point marginal distribution of (f(x1), f(x2)).

- Obviously both distributions are zero-mean Gaussians.
- To compute the (co)variance, take products and expectations and remember that ⟨wi wj⟩ = δij.
- Defining ϕ(x) = (ϕ1(x), ..., ϕN(x)), we get that ⟨f(xi) f(xj)⟩ = ϕ(xi)ᵀ ϕ(xj).
The Gram matrix

- Generalising the exercise to more than two points, we get that any finite-dimensional marginal of this process is multivariate Gaussian.
- The covariance matrix is given by evaluating a function of two variables at all possible pairs of points.
- The function is defined by the set of basis functions:

  k(xi, xj) = ϕ(xi)ᵀ ϕ(xj)

- The covariance matrix is often called the Gram matrix and is (necessarily) symmetric and positive definite.
- Bayesian prediction in regression is then essentially the same as computing conditionals of Gaussians (more later).
Main limitation of basis function regression

- The choice of basis functions inevitably impacts what can be predicted.
- Suppose the basis functions all tend to zero as x → ∞. What would we predict for very large input values?
- Very large input values will have predicted outputs near zero, with high confidence!
- Ideally, one would want a prior over functions which has the same uncertainty everywhere.
Stationary variance

- We have seen that the variance of a random combination of functions depends on space as Σi ϕi²(x).
- Given any compact set (e.g. a hypercube centred at the origin), we can find a finite set of basis functions s.t. Σi ϕi²(x) = const (a partition of unity, e.g. triangulations or smoother alternatives).
- We can construct a sequence of such sets which covers the whole of R^D in the limit.
- Therefore, we can construct a sequence of priors which all have constant prior variance across the whole space.
- Covariances would still be computed by evaluating a Gram matrix (and need not be constant).
Function space view

- The argument above shows that we can put a prior over infinite-dimensional spaces of functions s.t. all finite-dimensional marginals are multivariate Gaussian.
- The constructive argument, often referred to as the weight-space view, is useful for intuition but impractical.
- It does demonstrate the existence of truly infinite-dimensional Gaussian processes.
- Once we accept that Gaussian processes exist, we are better off proceeding along a more abstract line.
Gaussian Processes

Gaussian processes are a natural extension of the multivariate normal distribution to infinite-dimensional spaces of functions.

A GP is a probability measure over the space of continuous functions (over a suitable input space) such that the random vector obtained by evaluating a sample function at any finite set of points follows a multivariate normal distribution.

A GP is uniquely defined by its mean and covariance functions, denoted by µ(x) and k(x, x′):

f ∼ GP(µ, k) ↔ (f(x1), ..., f(xN)) ∼ N(µ, K),

where µ = (µ(x1), ..., µ(xN)) and K = (k(xi, xj))i,j.
The Radial Basis Function Kernel

The kernel function is the most important ingredient of a GP (the mean function is usually taken to be zero).

Radial Basis Function kernel

k(x, x′) = γ exp[ −‖x − x′‖² / λ² ]

It depends on two hyper-parameters: the amplitude γ and the lengthscale λ. Sample functions from a GP with RBF covariance are, with probability 1, infinitely differentiable.

Fact
Any smooth function can be approximated to arbitrary precision by a sample from a GP with RBF covariance.
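A minimal sketch of the RBF kernel and of sampling from the corresponding GP prior; the jitter added to the Gram matrix is a standard numerical safeguard, not part of the definition.

```python
import numpy as np

def rbf_kernel(X1, X2, gamma=1.0, lam=0.2):
    """RBF kernel k(x, x') = gamma * exp(-||x - x'||^2 / lam^2)."""
    d2 = np.sum((X1[:, None, :] - X2[None, :, :]) ** 2, axis=-1)
    return gamma * np.exp(-d2 / lam**2)

# Sampling from the GP prior: any finite marginal is N(0, K).
X = np.linspace(0.0, 1.0, 100)[:, None]
K = rbf_kernel(X, X)
jitter = 1e-9 * np.eye(len(X))             # numerical safeguard only
rng = np.random.default_rng(1)
samples = rng.multivariate_normal(np.zeros(len(X)), K + jitter, size=3)
# each row of `samples` is an (almost surely smooth) function drawn from the GP
```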
Bayesian prediction with GPs

- The joint probability of function values at a set of points is multivariate Gaussian.
- If I have observations yi, i = 1, ..., N, of the function at inputs xi, what can I say about the function value at a new point x∗?
- By Bayes' theorem, we have

  p(f∗ | y) ∝ ∫ df(x) p(f∗, f(x)) p(y | f(x))    (1)

  where f(x) is the vector of function values at the input points.
- If p(y | f(x)) is Gaussian, then we have a regression task and the integral in (1) can be computed analytically.
Regression calculations

- By the definition of GPs we know that the joint p(f∗, f(x)) in equation (1) is a zero-mean multivariate Gaussian with covariance

  Σ = ( k∗∗  k∗ᵀ
        k∗   K  )    (2)

  where k∗∗ = k(x∗, x∗), (k∗)j = k(x∗, xj) and Kij = k(xi, xj).
- Supposing the observations are i.i.d. Gaussian around the function values, p(y | f(x)) = N(f(x), σ²I), we can obtain the joint distribution of the observations and the new value:

  p(f∗, y) = N(0, Σy),    Σy = ( k∗∗  k∗ᵀ
                                 k∗   K + σ²I )    (3)
Regression calcs cont'd

- Conditioning on observations means fixing the values of the variables y to the actual measurements in equation (3).
- The posterior distribution is then obtained by completing the square and using the partitioned inverse formula (see e.g. Rasmussen and Williams, Gaussian Processes for Machine Learning, MIT Press, 2006).
- The final result is that p(f∗ | y) = N(m, v) with

  m = k∗ᵀ (K + σ²I)⁻¹ y,    v = k∗∗ − k∗ᵀ (K + σ²I)⁻¹ k∗    (4)
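The posterior formulas (4) translate directly into a few lines of linear algebra. The sketch below (illustrative, not the slides' own code) uses a Cholesky solve rather than an explicit inverse, a standard numerical choice; rbf_kernel refers to the earlier sketch.

```python
import numpy as np

def gp_posterior(X, y, Xstar, kernel, sigma2=0.01):
    """Predictive mean and variance of equation (4), via a Cholesky solve."""
    K = kernel(X, X) + sigma2 * np.eye(len(X))    # K + sigma^2 I
    ks = kernel(X, Xstar)                         # (k_*)_j = k(x_j, x_*)
    kss = np.diag(kernel(Xstar, Xstar))           # k(x_*, x_*)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    m = ks.T @ alpha                              # m  = k_*^T (K + s^2 I)^-1 y
    v = np.linalg.solve(L, ks)
    var = kss - np.sum(v**2, axis=0)              # v = k_** - k_*^T (...)^-1 k_*
    return m, var

# Example: recover a noisy sine from 8 observations
X = np.linspace(0.0, 4.0, 8)[:, None]
y = np.sin(X[:, 0]) + 0.1 * np.random.default_rng(2).standard_normal(8)
grid = np.linspace(0.0, 4.0, 200)[:, None]
mean, var = gp_posterior(X, y, grid, rbf_kernel, sigma2=0.01)
```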
Observations

- Equations (4) provide the predictive distribution analytically, at all points in space.
- The expectation is always a linear combination of the observations; far-away points contribute less to the prediction, being multiplied by the fast-decaying term k∗j.
- Adding a new observation always reduces the uncertainty at all points → vulnerable to outliers.
- MAIN PROBLEM: GP prediction relies on a matrix inversion, which scales cubically with the number of points!
- Sparsification methods have been proposed, but in high dimensions GP regression is likely to remain tricky.
GP regression - example

[Figure: one-dimensional GP regression over the input range [0, 4], showing the true function, the GP prediction, 95% confidence bands and the observations.]
Smoothed model checking: An intuitive view

[Figure, in three stages: the satisfaction probability as a function of a model parameter, with values between 0 and 1; a few noisy estimates at individual parameter values are smoothed into a reconstruction of the whole satisfaction function with confidence bounds.]
GP prediction from binary observations

- Equation (1) holds whatever p(y | f(x)) is.
- In the model checking case, the truth of a formula at a specified parameter value is a boolean variable following a Bernoulli distribution.
- We can encode this in a GP prediction mechanism and work directly with (few) truth observations at a set of parameter values.
- The integral in (1) is no longer analytically computable; we use Expectation Propagation (EP), an established fast approximate inference method.
Smoothed model checking

We call the following statistical model checking algorithm smoothed model checking:

1. Input a set of parameter values and a number of observations per parameter value.
2. Perform (approximate) GP prediction to obtain estimates of the satisfaction function and its uncertainty.
3. If the uncertainty is too high, increase the resolution of the parameter grid/the number of observations and repeat.
4. Return the estimated satisfaction function and uncertainties.
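The sketch below conveys the shape of this loop under a simplifying assumption: it fits plain GP regression to empirical satisfaction fractions, whereas the actual method places a Bernoulli likelihood over the individual truth values and infers the latent function with EP. gp_posterior refers to the regression sketch above; all names are illustrative.

```python
import numpy as np

def smoothed_mc(simulate, check, thetas, runs_per_theta, kernel):
    """Simplified smoothed MC sketch: GP regression on empirical satisfaction
    fractions. (The paper's method uses a Bernoulli likelihood with EP.)

    simulate(theta) -> trajectory;  check(trajectory) -> bool
    thetas: (n, d) array of parameter values
    """
    fracs = np.array([
        np.mean([check(simulate(th)) for _ in range(runs_per_theta)])
        for th in thetas
    ])
    sigma2 = 0.25 / runs_per_theta        # crude bound on binomial noise
    # returns a predictor over a grid of parameter values
    return lambda grid: gp_posterior(thetas, fracs, grid, kernel, sigma2)
```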
Network Epidemics
[Figure: network diagram of the model, with node states S, I, R and transitions ext (external infection), inf (internal infection), patch and loss.]

We investigate an SIR-like network epidemics model, with a network of 100 nodes having three states: susceptible (XS), infected (XI), and patched (XR). The dynamics is described by the following reactions:

External infection: S → I, with rate function ke XS;
Internal infection: S + I → I + I, with rate function ki XS XI;
Patching: I → R, with rate function kr XI;
Immunity loss: R → S, with rate function ks XR.
Example: Epidemics

SIR model
We investigate the dependence of the truth probability of the property

ϕ = (XI > 0) U[100,120] (XI = 0)

on the infection rate kI and the recovery rate kR. We use ten simulations at each of ten parameter values.

[Figure: estimated satisfaction probability with confidence bounds, as a function of kI (left) and of kR (right); blue dots show SMC estimates from 5000 runs per parameter value.]
Example: Epidemics

SIR model - two free parameters
We estimate p(ϕ | kI, kR) from 2560 traces for the same property

ϕ = (XI > 0) U[100,120] (XI = 0)

[Figure: satisfaction probability surface over (ki, kr). Left: smoothed MC; right: SMC on a 16x16 grid with 5000 runs per point. SMC took 580 times longer than smoothed MC.]
Example: LacZ operon

[Figure: reaction network of the LacZ operon model, as in Example 1.]

Stochastic model of a celebrated regulatory circuit in E. coli anabolism (Kierzek, 2002).
Example: LacZ operon

[Figure: probability surfaces over (k2, k7). Left: smoothed MC; right: SMC with 100 runs per parameter value.]

Model checking the probability of bursting expression,

ϕ = F[16000,21000](∆XLacZ > 0 ∧ G[10,2000] ∆XLacZ ≤ 0),

as we vary k2 and k7.
Quantitative evaluation

Table: RMSE for the expression burst formula (LacZ model); the k2 and k7 parameters have been explored simultaneously. A grid of 256 parameter values and a varying number of observations per value has been used. The true values were approximated via SMC using 2000 simulation runs. The RMSE values presented are the average of 5 independent experiment iterations.

Obs. per value | Bayesian SMC   | Smoothed MC (100 points) | Smoothed MC (256 points)
5              | 0.1819 ± 0.017 | 0.0513 ± 0.033           | 0.0390 ± 0.011
10             | 0.1246 ± 0.008 | 0.0412 ± 0.006           | 0.0322 ± 0.003
20             | 0.0954 ± 0.010 | 0.0375 ± 0.011           | 0.0248 ± 0.003
50             | 0.0588 ± 0.005 | 0.0266 ± 0.005           | 0.0200 ± 0.004
100            | 0.0427 ± 0.006 | 0.0200 ± 0.002           | 0.0159 ± 0.001
200            | 0.0309 ± 0.003 | 0.0171 ± 0.002           | 0.0141 ± 0.002
Outline
Motivation and background
  Properties and qualitative data
  CTMCs and model checking
Smoothed model checking
  Uncertainty and smoothness
  Gaussian processes
  Smoothed model checking results
Robust system design from formal specification
  Signal Temporal Logic
  GP-UCB optimisation
  Experimental Results
U-check
Robust Design with Temporal Logic

Qualitative data
We consider a set of system requirements specified as (linear) temporal logic formulae.

Problem: Robust Synthesis/Design
Find the set of model parameters such that the model satisfies the requirements as robustly as possible.

Notion of robustness
This requires a proper notion of robustness of satisfaction for a temporal logic formula.
Signal Temporal Logic

STL is a metric linear-time logic for real-valued signals.

STL Syntax
Given a (primary) real-valued signal x[t] = (x1[t], ..., xn[t]), t ∈ R>0, xi ∈ R, the STL syntax is given by

ϕ := µ | ¬ϕ | ϕ1 ∧ ϕ2 | ϕ1 U[a,b] ϕ2

- µ : Rⁿ → B is an atomic predicate s.t. µ(x) := (y(x) > 0);
- y : Rⁿ → R is a real-valued function, the secondary signal.

As usual, F[a,b]ϕ := tt U[a,b] ϕ and G[a,b]ϕ := ¬F[a,b]¬ϕ.
Signal Temporal Logic

STL Quantitative Semantics
The quantitative satisfaction function ρ is defined by

ρ(µ, x, t) = y(x[t])    where µ ≡ (y(x[t]) > 0)
ρ(¬ϕ, x, t) = −ρ(ϕ, x, t)
ρ(ϕ1 ∧ ϕ2, x, t) = min(ρ(ϕ1, x, t), ρ(ϕ2, x, t))
ρ(ϕ1 U[a,b] ϕ2, x, t) = max_{t′ ∈ t+[a,b]} min( ρ(ϕ2, x, t′), min_{t″ ∈ [t,t′]} ρ(ϕ1, x, t″) )

This satisfaction score can be computed efficiently for piecewise-linear signals; see the Breach Matlab toolbox.
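A minimal sketch of this quantitative semantics on a uniformly sampled signal, covering the atomic, negation, conjunction and derived F/G cases (Until is omitted for brevity); tools such as Breach implement this exactly for piecewise-linear signals. The trajectory at the bottom is made up; the property is the Schlögl one used later in the talk.

```python
import numpy as np

# Robustness on a uniformly sampled signal; rho[i] holds rho(phi, x, t_i).

def rho_atomic(y_vals):                    # mu := (y(x) > 0)
    return np.asarray(y_vals, dtype=float) # rho(mu, x, t) = y(x[t])

def rho_not(rho):
    return -rho                            # rho(~phi) = -rho(phi)

def rho_and(r1, r2):
    return np.minimum(r1, r2)              # rho(phi1 & phi2) = min(...)

def rho_globally(rho, dt, a, b):           # G_[a,b]: min over t + [a,b]
    i, j = int(round(a / dt)), int(round(b / dt))
    return np.array([rho[t + i:t + j + 1].min() for t in range(len(rho) - j)])

def rho_eventually(rho, dt, a, b):         # F_[a,b]: max over t + [a,b]
    i, j = int(round(a / dt)), int(round(b / dt))
    return np.array([rho[t + i:t + j + 1].max() for t in range(len(rho) - j)])

# Example: rho(F_[0,T1] G_[0,T2] (X >= kt), x, 0) on a toy trajectory
dt, T1, T2, kt = 0.1, 10.0, 15.0, 300.0
rng = np.random.default_rng(3)
x = 300.0 + np.cumsum(rng.standard_normal(400))    # made-up trajectory of X
r = rho_eventually(rho_globally(rho_atomic(x - kt), dt, 0.0, T2), dt, 0.0, T1)
print(r[0])                                # robustness at time 0
```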
Satisfiability of Stochastic Models

The boolean semantics of an STL formula ϕ can be easily extended to stochastic models, as customary, by measuring the probability of the set of trajectories of the CTMC that satisfy the formula:

P(ϕ) = P{x | x |= ϕ}.

- The stochastic process X(t) is interpreted as a random variable X over the space of trajectories D.
- The set of trajectories that satisfy an STL formula ϕ can be described through a measurable indicator function

  Iϕ : D → {0,1}, s.t. Iϕ(x) = 1 ⇔ x |= ϕ.

Hence,

P(Iϕ(X) = 1) = P({x ∈ D | Iϕ(x) = 1}) = P(Iϕ⁻¹(1)) = P{x | x |= ϕ} = P(ϕ).
Robustness of Stochastic Models

The quantitative satisfaction function

ρ(ϕ, x) : D → R

is measurable (with respect to the σ-algebra induced by the Skorokhod topology on D), and hence induces a real-valued random variable Rϕ(X) with probability distribution

P(Rϕ(X) ∈ [a,b]) = P(X ∈ {x ∈ D | ρ(ϕ, x, 0) ∈ [a,b]}).

Indicators (estimated by Monte Carlo, as in the sketch below):
- E(Rϕ) (the average robustness degree);
- E(Rϕ | Rϕ > 0) and E(Rϕ | Rϕ < 0) (the conditional averages).
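Given sampled robustness values, the indicators above are straightforward Monte Carlo averages; a minimal sketch (illustrative names):

```python
import numpy as np

def robustness_stats(rho_samples):
    """Monte Carlo estimates of the indicators above, given
    rho_samples[i] = rho(phi, x_i, 0) for sampled trajectories x_i."""
    r = np.asarray(rho_samples, dtype=float)
    return {
        "E[R]": r.mean(),                      # average robustness degree
        "P(phi)": float(np.mean(r > 0)),       # satisfaction probability
        "E[R | R > 0]": r[r > 0].mean() if (r > 0).any() else float("nan"),
        "E[R | R < 0]": r[r < 0].mean() if (r < 0).any() else float("nan"),
    }
```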
Schlögl system

CTMC model of a Schlögl system. Biochemical reactions of the Schlögl model:

Reaction      | Rate constant | Initial population
A + 2X → 3X   | k1 = 3·10⁻⁷   | X(0) = 247
3X → A + 2X   | k2 = 1·10⁻⁴   | A(0) = 10⁵
B → X         | k3 = 1·10⁻³   | B(0) = 2·10⁵
X → B         | k4 = 3.5      |

[Figure: simulation of the Schlögl model (100 runs of X over time, t ∈ [0, 21]). Starting close to the boundary of the basin of attraction, the bistable behaviour is evident.]
Schlögl system
STL formula: ϕ : F[0,T1] G[0,T2] (X ≥ kt), kt = 300

[Figure: histogram of the robustness degree of ϕ over sampled trajectories, spread between roughly −300 and 300.]

Statistical estimation of ϕ: p = 0.4583 (10000 runs, error ±0.02 at the 95% confidence level).
Schlögl system
Satisfaction probability versus average robustness degree.

[Figure, left: varying the threshold kt. Correlation around 0.8386; the dependency seems to follow a sigmoid-shaped curve.]

[Figure, right: varying k3. Correlation around 0.9718, with an evident linear trend.]
The System Design Problem

"Given a stochastic model depending on a set of parameters θ ∈ K, and a specification ϕ (an STL formula), find the parameter combination θ∗ s.t. the system satisfies ϕ as robustly as possible."

Solution strategy:
- rephrase it as an unconstrained optimisation problem: we maximise the expected robustness degree;
- evaluate the function to optimise using statistical model checking with a fixed number of runs;
- solve the optimisation problem using Gaussian Process Upper Confidence Bound (GP-UCB) optimisation.
The optimisation problem

- We need to maximise the expected robustness.
- Each evaluation of this function is costly (we obtain it by SMC, so we need to run the SSA many times).
- Each evaluation of this function is noisy (we estimate the value by SMC; the noise is approximately Gaussian).

We need to maximise an unknown function, which we can observe with noise, with a minimal number of evaluations.

We use the GP-UCB algorithm (Srinivas et al. 2012).
The GP-UCB algorithm

Basic Idea
Use GP regression to emulate the unknown function, and explore the region near the maximum of the posterior mean. Doing this naively ⇒ trapped in local optima.

Balance Exploration and Exploitation
Maximise an upper quantile of the distribution, obtained as the mean value plus a constant times the standard deviation:

x_{t+1} = argmax_x [ µt(x) + βt σt(x) ]

Then x_{t+1} is added to the observation points.

The algorithm has a convergence guarantee in terms of regret bounds (for slowly increasing βt).
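A compact sketch of the GP-UCB loop on a finite candidate grid, reusing gp_posterior from the regression sketch above; a constant β is used here for simplicity, whereas the regret bounds require a slowly growing βt. All names are illustrative.

```python
import numpy as np

def gp_ucb(evaluate, grid, kernel, n_iter=30, sigma2=0.1, beta=2.0):
    """GP-UCB sketch on a finite candidate grid (shape (m, d)).

    evaluate(x) returns a noisy estimate of the expected robustness at x
    (e.g. an SMC average over a fixed number of SSA runs).
    """
    X = [grid[len(grid) // 2]]                   # start from the grid centre
    y = [evaluate(X[0])]
    for _ in range(n_iter):
        m, var = gp_posterior(np.array(X), np.array(y), grid, kernel, sigma2)
        ucb = m + beta * np.sqrt(np.maximum(var, 0.0))   # optimistic score
        x_next = grid[int(np.argmax(ucb))]       # explore/exploit trade-off
        X.append(x_next)
        y.append(evaluate(x_next))               # noisy robustness estimate
    best = int(np.argmax(y))
    return X[best], y[best]
```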
The GP-UCB algorithm - example

[Figure: successive iterations of GP-UCB on a one-dimensional test function over [1, 3]; the GP posterior and its upper confidence bound concentrate around the maximum as evaluation points are added.]
Experimental Results (Schlögl System)

Statistics of the results of ten experiments to optimise the parameter k3, for ϕ : F[0,T1] G[0,T2] (X ≥ kt), in the range [50, 1000]:

Parameter mean | Parameter range  | Mean probability
k3 = 997.78    | [979.31, 999.99] | 1

Average robustness | Function evaluations | Simulation runs
348.97             | 34.4                 | 3440

[Figure: the emulated robustness function in the optimisation of k3, over the range [100, 1000].]
Experimental Results (Schlögl System)
The distribution of the robustness score for ϕ : F[0,T1] G[0,T2] (X ≥ kt) with k3 = 999.99, T1 = 10, T2 = 15 and kt = 300:

[Figure: histogram of the robustness degree, concentrated between 250 and 400.]
U-check
- Open-source Java implementation of smoothed model checking and learning/designing from qualitative data
- Takes as input models specified in PRISM, BioPEPA or SimHyA
- Outputs results readable/analysable with MATLAB or GNUplot
- Java source code available at https://github.com/dmilios/U-check
Other related things
- Learning parametrisations from observations of formula truth values (QEST 2013)
- Learning parametric formulae from time-series data (with E. Bartocci, FORMATS 2014)
- Spatio-temporal generalisations of robust system design (with E. Bartocci and L. Nenzi, HSB 2015)
- Statistical abstractions of stiff dynamical systems for efficient simulations (CMSB 2015)
Conclusions

- Uncertainty has implications also for formal analysis techniques.
- When there is uncertainty, machine learning is likely to be of use.
- We proposed methods relying on Gaussian processes to efficiently compute the probability of a property as a function of the parameters, and to train models from qualitative observations.
- Similar ideas can be applied to other performance metrics, such as expected rewards.
- We have just released a tool: U-check.
The End
and thanks to the ERC!
Bibliography
- L. Bortolussi, D. Milios and G. S. Smoothed Model Checking for Uncertain Continuous Time Markov Chains. http://arxiv.org/pdf/1402.1450.pdf, to appear in Information and Computation.
- E. Bartocci, L. Bortolussi, L. Nenzi, and G. S. On the robustness of temporal properties for stochastic models. Electronic Proceedings in Theoretical Computer Science, 125:3–19, Aug. 2013.
- E. Bartocci, L. Bortolussi, L. Nenzi and G. S. System design of stochastic models using robustness of temporal properties. Theor. Comp. Sci., 587:3–25.
- L. Bortolussi and G. Sanguinetti. Learning and designing stochastic processes from logical constraints. In Proceedings of QEST 2013; extended version in LMCS 2015.
- L. Bortolussi, D. Milios and G. S. U-check: Model Checking and Parameter Synthesis under Uncertainty. QEST 2015.
- C. E. Rasmussen and C. K. I. Williams. Gaussian Processes for Machine Learning. MIT Press, Cambridge, Mass., 2006.
- N. Srinivas, A. Krause, S. M. Kakade, and M. W. Seeger. Information-theoretic regret bounds for Gaussian process optimization in the bandit setting. IEEE Transactions on Information Theory, 58(5):3250–3265, May 2012.
- A. Donzé and O. Maler. Robust satisfaction of temporal logic over real-valued signals. In Proc. of FORMATS, 2010.