Assignment2/report/intro.tex


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91

\section{Introduction}
A Discrete-Time Markov Chain (DTMC) is a Kripke structure enhanced with a \emph{transition probability function} $P : S\times S\to [0,1]$,
	indicating the probability of moving from one state to another state,
	and an \emph{initial distribution} $\iota_{\text{init}} : S\to [0,1]$,
	indicating for each state the probability of starting in it~\cite[747--748]{BK}.
The transition probability function and the initial distribution should be total, i.e., the probabilities (per state) sum up to one.
A Markov Decision Process (MDP) further extends a DTMC with nondeterministic choice.
It adds a set of actions $\textsl{Act}$ and uses $P : S\times \textsl{Act}\times S\to [0,1]$ as the transition probability function~\cite[833--834]{BK}.
The transition probability function should still be total for pairs of states and actions.
A scheduler $\sigma$ on an MDP $\mathcal M$ induces the DTMC $\mathcal M^\sigma$ by choosing an action in each state,
	given the path fragment leading up to that state~\cite[842]{BK}.

Properties of DTMCs and MDPs are typically described using Probabilistic Computation Tree Logic (PCTL), an extension of CTL~\cite[780--781]{BK}.
It adds state formulae $\mathbb P_J(\phi)$, where $J$ is an interval with rational bounds and $\phi$ a path formula.
A state $s$ satisfies $\mathbb P_J(\phi)$ iff the probability of satisfying $\phi$ for a path starting in $s$ is in $J$.
PCTL also adds the bounded-until path formula $\Phi_1 \Uop^{\le n} \Phi_2$, where $n\in \mathbb N$ and $\Phi_{1,2}$ are state formulae ---
	the semantics are that $\Phi_2$ should be satisfied at most $n$ steps after $\Phi_1$.%
	\footnote{%
		Furthermore, to PCTL adds state formulae $\mathbb E_R(\phi)$ to accommodate Markov Reward Models (a Markov Chain with a reward function mapping states or transitions to natural numbers~\cite[817]{BK}).
		Here, $R$ is an interval with rational bounds, $\phi$ a path formula, and $\mathbb{E}$ a function that determines the cost of satisfying $\phi$ in $R$.
		Since $\mathbb E$ is not relevant to the present discussion, we do not discuss it in further detail.}

The Model Checking problem for a PCTL state formula $\Phi$ and a state $s$ in a DTMC is to verify whether $s \vDash \Phi$.
Model checking proceeds as for the plain CTL case, by computing the satisfaction set of the subformulas of $\Phi$.
The satisfaction set of $\mathbb P_J(\phi)$ can be computed using a linear equation system~\cite[785]{BK}.
An MDP satisfies a formula if the formula is satisfied by the induced DTMCs of all possible schedulers.
Model checking of MDPs is possible since there exist finite-memory schedulers optimising reachability properties~\cite[865--866]{BK}.

However, for complex systems, the answer to the Model Checking problem (\enquote{yes} or \enquote{no} with a counterexample) is little helpful,
	as it may not be clear at all how the model should be changed to make it satisfy the property.
For this purpose, we define the Model Repair problem as Bartocci et al. \cite[327]{Bartocci2011}:
\begin{quote}
	Given a probabilistic system $M$ and a probabilistic temporal logic formula $\phi$ such that $M$ fails to satisfy $\phi$,
		the \emph{probabilistic Model Repair problem} is to find an $M'$ that
			satisfies $\phi$ and differs from $M$ only in the transition flows of those states in $M$ that are deemed \emph{controllable}.
\end{quote}
Furthermore, we may score reparations using a \emph{cost} function,
	and want a Model Repair tool to minimise this cost.

A straightforward cost function is the mean squared error (MSE).
For all pairs of states $s_1,s_2$ in the original model $M$,
	we compute the squared error as the square of the difference of the probabilities of $s_1\to s_2$ in $M$ and the repaired model $M'$.
The MSE is the average of these individual error measures.
It does not distinguish between different transitions, but by squaring it prefers multiple minor modifications over one major modification.
The MSE is well-established in statistics.
Of course, in the usual case where we only need to consider one model with different reparations,
	the size of $M$ is fixed and we may use the sum of squared errors (SSE) as well.

Another reasonable cost function is the number of modified transitions.
This measure counts the pairs of states $s_1,s_2$ for which the probability of $s_1\to s_2$ differs between $M$ and $M'$.
The intuition is that the model is almost correct but may contain human errors, such as off-by-one errors or typos.

\begin{figure}
	\centering
	\begin{tikzpicture}[->,>=stealth,scale=0.9,every node/.style={scale=0.9,transform shape}]
		\node[state,initial,initial text={}] (s0) {$s_0$};

		\node[state,above right=of s0] (s1) {$s_1$};
		\draw (s0) -- node[above left] {$p$} ++ (s1);
		\node[state,above right=of s1] (s3) {$s_3$};
		\draw (s1) -- node[below right] {$p$} ++ (s3);
		\draw (s3) edge[bend right=45] node[above left] {$p$} (s1);
		\node[state,right=of s3,label={right:\enquote{1}}] (s7) {$s_7$};
		\draw (s3) -- node[above] {$1-p$} ++ (s7);
		\node[state,below right=of s1,yshift=3em] (s4) {$s_4$};
		\draw (s1) -- node[below,xshift=-.5em] {$1-p$} ++ (s4);
		\node[state,right=of s4,yshift=2.7em,label={right:\enquote{2}}] (s8) {$s_8$};
		\draw (s4) -- node[above] {$p$} ++ (s8);
		\node[state,right=of s4,yshift=-1em,label={right:\enquote{3}}] (s9) {$s_9$};
		\draw (s4) -- node[below] {$1-p$} ++ (s9);

		\node[state,below right=of s0] (s2) {$s_2$};
		\draw (s0) -- node[below left] {$1-p$} ++ (s2);
		\node[state,above right=of s2,yshift=-3em] (s5) {$s_5$};
		\draw (s2) -- node[above,xshift=-.5em] {$1-p$} ++ (s5);
		\node[state,right=of s5,yshift=1em,label={right:\enquote{4}}] (s10) {$s_{10}$};
		\draw (s5) -- node[above] {$p$} ++ (s10);
		\node[state,right=of s5,yshift=-2.7em,label={right:\enquote{5}}] (s11) {$s_{11}$};
		\draw (s5) -- node[below] {$1-p$} ++ (s11);
		\node[state,below right=of s2] (s6) {$s_6$};
		\draw (s2) -- node[above right] {$p$} ++ (s6);
		\draw (s6) edge[bend left=45] node[below left] {$p$} (s2);
		\node[state,right=of s6,label={right:\enquote{6}}] (s12) {$s_{12}$};
		\draw (s6) -- node[above] {$1-p$} ++ (s12);
	\end{tikzpicture}
	\caption{The Knuth-Yao die model~\cite{KnuthYao1976}.\label{fig:knuth-yao}}
\end{figure}

The present paper discusses a straightforward method to build a Model Repair pipeline using a Model Checking tool and an SMT solver.
The authors' implementation uses Storm~\cite{Storm} and Z3~\cite{Z3}, but the technique is agnostic as to what tool is used.
As a running example, we will be using the well-known Knuth-Yao die~\cite{KnuthYao1976}, using a potentially unfair coin (see \cref{fig:knuth-yao}).