path: root/Practical1/report/analysis.tex
author Camil Staps 2015-11-02 15:49:23 +0100
committer Camil Staps 2015-11-02 15:49:23 +0100
commit e4ee1f8a1e5707a4d8478a678a03ce664ae8f894
tree 7a3f3d49c54d888f7ded3e2dafb3ec03ee6f3672 /Practical1/report/analysis.tex
parent P1 report: typo; margins; copyright; lemmas / proofs on one page
Working on report
Diffstat (limited to 'Practical1/report/analysis.tex')
-rw-r--r-- Practical1/report/analysis.tex 147
1 files changed, 135 insertions, 12 deletions
diff --git a/Practical1/report/analysis.tex b/Practical1/report/analysis.tex
index d796f96..aae5697 100644
--- a/Practical1/report/analysis.tex
+++ b/Practical1/report/analysis.tex
@@ -1,10 +1,12 @@
\section{Complexity analysis}
\label{sec:analysis}
-Robson provides us with a very detailed analysis proving that the $\ms$ algorithm runs in $\pazocal O(\mathrm{polynomial}(n)\cdot2^{cn})$, where $c<0.296$ and with $n$ the number of vertices of the input graph \cite{robson}. However, the argument is over my head, so I will provide my own, simpler, less exact analysis here.
+Robson provides us with a very detailed analysis proving that the $\ms$ algorithm runs in $\pazocal O(\mathrm{polynomial}(n)\cdot2^{cn})$, where $c<0.296$ and with $n$ the number of vertices of the input graph \cite{robson}. However, the argument goes over my head, so I will provide my own, simpler, less exact analysis here.
We write $T_{\ms}(n)$ for the running time of $\ms$ on a graph $G$ with $|G|=n$. For $\ms^n$ we write $T_{\ms^n}(p,q)$ for the running time of $\ms^n$ on a graph $G$ with $|G|=p$ and a set $S$ with $|S|=q$ from which $n$ vertices are picked. In the case of $n=1$, we write $T_{\ms^1}(p)$ instead of $T_{\ms^1}(p,2)$, because we only use the case of $q=2$.
+For the sake of brevity, I will use but not discuss here that basic graph operations such as dominance checking, edge checking, degree computation and others run in $\pazocal O(\mathrm{polynomial}(n))$. Informally, this can be checked by looking at the functions in the Java code accompanying this report.
+
\subsection{$\ms^1$: \autoref{alg:ms1}}
\label{sec:analysis:ms1}
@@ -13,17 +15,19 @@ This function is a simple case distinction. Let's therefore consider the complex
\begin{table}[h]
\centering
\renewcommand{\arraystretch}{1.3}
- \begin{tabular}{l | p{29mm} | p{19mm} | p{50mm}}
+ \begin{tabular}{r | p{29mm} | p{19mm} | p{50mm}}
Ln. & Case & Complexity & Argumentation \\\hline\hline
\ref{alg:ms1:case-d1} & $d(s_1)\leq 1$
- & $T_{\ms}(n-1)$
- & Simple recurrence \\\hline
+ & $T_{\ms}(n)$;\newline
+ $T_{\ms}(n-1)+\pazocal O(n)$
+ & Simple recurrence; see below\\\hline
\ref{alg:ms1:case-edge-d4} & $e(s_1,s_2), d(s_1)>3$
& $2T_{\ms}(n-5)$
& Both vertices have degree four, so we run $\ms$ twice on a graph with five vertices less\\\hline
\ref{alg:ms1:case-edge-d3} & $e(s_1,s_2), d(s_1)\leq3$
- & $T_{\ms}(n)$
- & Simple recurrence \\\hline
+ & $T_{\ms}(n)$;\newline
+ $3T_{\ms}(n-2)$
+ & Simple recurrence; see below\\\hline
\ref{alg:ms1:intersection} & $N(s_1)\cap N(s_2)\neq\emptyset$
& $T_{\ms^1}(n-1)$
& At least one vertex removed \\\hline
@@ -40,10 +44,18 @@ This function is a simple case distinction. Let's therefore consider the complex
& $T_{\ms}(n-3) + T_{\ms^2}(n-4,4)$
& In the first case at least three vertices removed, in the second case four, and at most four to choose two from
\end{tabular}
- \caption{Worst-case time complexity analysis of the different cases in \autoref{alg:ms1}}
+ \caption{Worst case time complexity analysis of the different cases in \autoref{alg:ms1}}
\label{tab:analysis-ms1}
\end{table}
+The case in line \ref{alg:ms1:case-d1} is problematic in the sense that the size of the graph is not reduced. However, we know that $d(s_1)\le1$, which means that there is either a strongly connected component of size $1$, namely $\{s_1\}$, or that there is a $v$, namely $s_1$, with $d(v)=1$. In the first case, the time complexity will reduce to $T_{\ms}(1)+T_{\ms}(n-1)=T_{\ms}(n-1) + \pazocal O(n)$, as we will see below in \autoref{tab:analysis-ms}. In the second case, the time complexity reduces to $T_{\ms}(n-2) < T_{\ms}(n-1) + \pazocal O(n)$.
+
+We need to do something similar for the case in line \ref{alg:ms1:case-edge-d3}. In this case, we know there is a vertex $v$ with $d(v)\le3$. This is broken down into a number of cases in \autoref{alg:ms}, of which the time complexity may be found in \autoref{tab:analysis-ms} below. It is easy to see that all of the relevant cases have a running time less than $T_{\ms}(n-2)+T_{\ms^2}(n-3,6)+T_{\ms^2}(n-1,3)$. Since $T_{\ms^2}(n-3,\dots)<T_{\ms}(n-2)$ and $T_{\ms^2}(n-1,3)<T_{\ms}(n-2)$ by \autoref{tab:analysis-ms2} below, we know that this is surely less than $3T_{\ms}(n-2)$.
+
+\begin{thmproof}[ms1-le-ms]{lemma}{There exists a $c$ for which for all $n$ it holds that $T_{\ms^1}(n)\le c\cdot T_{\ms}(n-1)$.}
+ Every case in \autoref{tab:analysis-ms1} suffices.
+\end{thmproof}
+
\subsection{$\ms^2$: \autoref{alg:ms2}}
\label{sec:analysis:ms2}
Also this is a simple case distinction. Let's again put the different cases in a table, along with their line numbers. See \autoref{tab:analysis-ms2}.
@@ -51,16 +63,127 @@ Also this is a simple case distinction. Let's again put the different cases in a
\begin{table}[h]
\centering
\renewcommand{\arraystretch}{1.3}
- \begin{tabular}{l | p{29mm} | p{19mm} | p{50mm}}
- Ln. & Case & Complexity & Argumentation \\\hline\hline
+ \begin{tabular}{r | c | p{29mm} | p{19mm} | p{40mm}}
+ Ln. & $|S|$ & Case & Complexity & Argumentation \\\hline\hline
+ \ref{alg:ms2:case-s1} & $0,1$ & All
+ & $\pazocal O(1)$
+ & Constant return value\\\hline
+ \ref{alg:ms2:case-s2-edge} & $2$ & $e(s_1,s_2)$
+ & $\pazocal O(1)$
+ & Constant return value\\\hline
+ \ref{alg:ms2:case-s2-noedge} & $2$ & Otherwise
+ & $T_{\ms}(n-2)$
+ & At least two vertices removed\\\hline
+ \ref{alg:ms2:case-s3-d0} & $3$ & $d(s_i)=0$
+ & $T_{\ms^1}(n-1)$
+ & Simple recurrence\\\hline
+ \ref{alg:ms2:case-s3-cycle} & $3$ & $s_1,s_2,s_3$ form a three-cycle
+ & $\pazocal O(1)$
+ & Constant return value\\\hline
+ \ref{alg:ms2:case-s3-path} & $3$ & $i\neq k \land e(s_i,s_j)\land e(s_i,s_k)$
+ & $T_{\ms}(n-3)$
+ & At least $s_1,s_2,s_3$ removed\\\hline
+ \ref{alg:ms2:case-s3-edge} & $3$ & $e(s_i,s_j)$
+ & $T_{\ms^1}(n-1)$
+ & At least $s_k$ removed\\\hline
+ \ref{alg:ms2:case-s3-intersection} & $3$ & $N(s_i)\cap N(s_j)\neq\emptyset$
+ & $T_{\ms^2}(n-1,2)$
+ & At least one element in the intersection which is removed\\\hline
+ \ref{alg:ms2:case-s3-d1} & $3$ & $d(s_i)=1$
+ & $T_{\ms^1}(n-2)$
+ & Both $s_i$ and its neighbour removed\\\hline
+ \ref{alg:ms2:case-s3-otherwise} & $3$ & Otherwise
+ & $T_{\ms^1}(n-3) + T_{\ms^2}(n-7,4)$
+ & Every $s_i$ has $d(s_i)\ge2$\\\hline
+ \ref{alg:ms2:case-s4-dle3} & $4$ & $d(s_i)\le3$
+ & $T_{\ms}(n)$; \newline
+ $3T_{\ms}(n-2)$
+ & Simple recurrence; see below\\\hline
+ \ref{alg:ms2:case-s4-no-dle3} & $4$ & Otherwise
+ & $T_{\ms}(n-5) + T_{\ms^2}(n-1,3)$
+ & $d(s_1)\ge4$ so at least $5$ removed in the first case; second case is a simple recurrence\\\hline
+ \ref{alg:ms2:case-otherwise} & $5+$ & All
+ & $T_{\ms}(n)$
+ & Simple recurrence
\end{tabular}
- \caption{Worst-case time complexity analysis of the different cases in \autoref{alg:ms2}}
+ \caption{Worst case time complexity analysis of the different cases in \autoref{alg:ms2}}
\label{tab:analysis-ms2}
\end{table}
-%todo
+The case in line \ref{alg:ms2:case-s4-dle3} is again problematic. However, as above for line \ref{alg:ms1:case-edge-d3} of \autoref{alg:ms1}, this reduces to $3T_{\ms}(n-2)$.
+
+Line \ref{alg:ms2:case-otherwise} is problematic as well. However, in \autoref{alg:ms} we see that the only time $\ms^2$ is called with $|S|>4$ is in line \ref{alg:ms:case-d2-noedge}. In \autoref{tab:analysis-ms} below we see that in this particular case $3$ vertices were already removed from the graph. We will use this below in the analysis of \autoref{alg:ms}, line \ref{alg:ms:case-d2-noedge} in \autoref{tab:analysis-ms}.
+
+\begin{thmproof}[ms2-le-ms]{lemma}{There exists a $c$ for which for all graphs $G$ and subgraphs $S$ with $|S|\le4$, we have $T_{\ms^2}(|G|,|S|)\le c\cdot T_{\ms}(|G|-2)$.}
+ Every case in \autoref{tab:analysis-ms2} suffices. Cases with complexity $T_{\ms^1}(n-x)$ suffice by \autoref{lem:ms1-le-ms}.
+\end{thmproof}
\subsection{$\ms$: \autoref{alg:ms}}
\label{sec:analysis:ms}
-%todo
+Also this is a simple case distinction. Let's again put the different cases in a table, along with their line numbers. See \autoref{tab:analysis-ms}.
+
+\begin{table}[h]
+ \centering
+ \renewcommand{\arraystretch}{1.3}
+ \begin{tabular}{r | p{28mm} | p{20mm} | p{50mm}}
+ Ln. & Case & Complexity & Argumentation \\\hline\hline
+ \ref{alg:ms:case-component} & Multiple strongly connected components
+ & $T_{\ms}(|C|) + T_{\ms}(n-|C|)$
+ & $|C|$ is the size of some strongly connected component\\\hline
+ \ref{alg:ms:case-g1} & $|G|\le1$
+ & $\pazocal O(n)$
+ & Counting the number of vertices\\\hline
+ \ref{alg:ms:case-d1} & $d(v)=1$
+ & $T_{\ms}(n-2)$
+ & $v$ and its neighbour removed\\\hline
+ \ref{alg:ms:case-d2-edge} & $d(v)=2$, $e(w,w')$
+ & $T_{\ms}(n-3)$
+ & $v$ and its two neighbours removed\\\hline
+ \ref{alg:ms:case-d2-noedge} & $d(v)=2$, otherwise
+ & $T_{\ms}(n-6) + T_{\ms^2}(n-3,6)$; \newline
+ $T_{\ms}(n-6) + T_{\ms}(n-3)$
+ & $d(w),d(w')\ge2$; $|N^2(v)|$ is at most $6$ for any degree-2 vertex in a graph with maximum degree $4$; as we saw in \autoref{sec:analysis:ms2}, this reduces to the second sum\\\hline
+ \ref{alg:ms:case-d3} & $d(v)=3$
+ & $T_{\ms}(n-4) + T_{\ms^2}(n-1,3)$; \newline
+ $T_{\ms}(n-4) + T_{\ms}(n-3)$
+ & Simple recurrences; second form by \autoref{lem:ms2-le-ms}\\\hline
+ \ref{alg:ms:case-dominance} & $v>w$
+ & $T_{\ms}(n-1)$; \newline
+ $T_{\ms}(n-3) + T_{\ms}(n-5)$
+ & Simple recurrence; see below\\\hline
+ \ref{alg:ms:case-otherwise} & Otherwise
+ & $T_{\ms}(n-1) + T_{\ms}(n-5)$
+ & $d(w)\ge d(v)\le4$ so $|\iN(w)|\le5$
+ \end{tabular}
+ \caption{Worst case time complexity analysis of the different cases in \autoref{alg:ms}}
+ \label{tab:analysis-ms}
+\end{table}
+
+We saw already in \autoref{sec:algorithm:good-v:dominance} that after removing $w$ in the case of dominance of $v$ over $w$ on line \ref{alg:ms:case-dominance}, we will have a vertex with degree $3$. Therefore, the recurrence $T_{\ms}(n-1)$ in this case reduces to one of the above cases with $n:=n-1$. This is at most $T_{\ms}(n-3)+T_{\ms}(n-5)$.
+
+We can do something similar in the `Otherwise' case. We can only be in this case if there is no vertex with degree less than $4$. Since all vertices have a maximum degree of $4$, this means that every vertex has degree $4$. In both subcases that are considered in line \ref{alg:ms:case-otherwise} of \autoref{alg:ms}, vertices are removed from the graph, leaving a graph with a vertex of degree at most $3$. If we use that knowledge, we see from the other cases in \autoref{tab:analysis-ms} that the time complexity of the `Otherwise' case can be rewritten as $T_{\ms}(n-3)+T_{\ms}(n-5)+T_{\ms}(n-7)+T_{\ms}(n-9)$.
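+
+As a sketch of this rewriting, assume that the bound $T_{\ms}(m)\le T_{\ms}(m-2)+T_{\ms}(m-4)$, which was used above for the dominance case with $m=n-1$, holds for every graph of size $m$ that has a vertex of degree at most $3$. Applying it to both terms of $T_{\ms}(n-1)+T_{\ms}(n-5)$, with $m=n-1$ and $m=n-5$ respectively, gives
+$$T_{\ms}(n-1)+T_{\ms}(n-5) \le \bigl(T_{\ms}(n-3)+T_{\ms}(n-5)\bigr) + \bigl(T_{\ms}(n-7)+T_{\ms}(n-9)\bigr).$$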
+
+\begin{thmproof}[otherwise-half-of-cases]{lemma}{Of any $k$ consecutive applications of $\ms$, at most $\frac k2$ will be using the inefficient `Otherwise' case.}
+ After applying the `Otherwise' case, there will be at least one vertex with degree lower than $4$. In the iteration directly after applying the `Otherwise' case, we don't need to apply it.
+\end{thmproof}
+
+\begin{thmproof}[ms-complexity]{theorem}{\autoref{alg:ms} runs in $\pazocal O\left(\mathrm{polynomial}(n)\cdot2^{n/2}\right)$.}
+ We see that $T_{\ms}(n)$ depends on $T_{\ms}(n-c)$ with $c\ge2$. Furthermore, if $c=2$, we have $T_{\ms}(n)=T_{\ms}(n-2)$, so if $T_{\ms}(n)$ depends on $T_{\ms}(n-2)$, it depends on that \emph{alone}, giving a branching factor of $1$. Branching only occurs when $c\ge3$. Only in the `Otherwise' case is the branching factor $4$; in the other cases it is $2$.
+
+ We then consider the search tree $T$ where we compress nodes with branching factor $1$. By \autoref{lem:otherwise-half-of-cases}, only on one of every two consecutive levels can we have a branching factor of $4$ (and in the worst case this will happen as often as possible). If we then combine the levels with branching factor $4$ with the levels directly below them (having branching factor $2$), we find a search tree $T'$ with, in the worst case, a constant branching factor of $8$.
+
+ The original search tree $T$ has at most $\frac n3$ levels, since branching only occurs when $c\ge3$. Therefore, the compressed tree $T'$ has, in the worst case, $\frac n6$ levels (note that this case is worse than the case of a tree with constant branching factor $2$ and $\frac n3$ levels, and therefore also worse than any case in between).
+
+ In every node of the search tree $T'$ we need polynomial time, since we assumed that basic graph operations run in polynomial time. Since $T'$ has $\pazocal O\left(8^{n/6}\right)$ nodes, \autoref{alg:ms} runs in $$\pazocal O\left(\mathrm{polynomial}(n)\cdot8^{n/6}\right) = \pazocal O\left(\mathrm{polynomial}(n)\cdot2^{n/2}\right).$$
+\end{thmproof}
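+
+To spell out the node count behind this bound (a sketch, assuming for simplicity that $n$ is a multiple of $6$, so that $T'$ has exactly $\frac n6$ levels below the root): a tree with constant branching factor $8$ has at most
+$$\sum_{i=0}^{n/6}8^i = \frac{8^{n/6+1}-1}{7} < \frac87\cdot8^{n/6} = \frac87\cdot2^{n/2}$$
+nodes, since $8^{n/6}=\left(2^3\right)^{n/6}=2^{n/2}$. Multiplying by the polynomial cost per node gives a total of $\pazocal O\left(\mathrm{polynomial}(n)\cdot2^{n/2}\right)$.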
+
+This result is, naturally, consistent with Robson's result that the algorithm runs in $\pazocal O\left(\mathrm{polynomial}(n)\cdot2^{cn}\right)$ with $c<0.296$ \cite{robson}. We have proven the weaker statement that the bound holds with $c=\frac12$.
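+
+To get a feeling for the gap between the two bounds (ignoring the polynomial factors, which is an assumption made only for this illustration): for $n=100$ we have $2^{n/2}=2^{50}\approx1.1\cdot10^{15}$, whereas $2^{0.296n}=2^{29.6}\approx8.1\cdot10^{8}$.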
+
+\subsection{Space complexity}
+\label{sec:analysis:space}
+A graph is stored as a list of nodes, and for every edge both nodes hold one reference. Therefore, the space complexity of a single graph $G=(V,E)$ is $\pazocal O(|V|+|E|)$.
+
+As discussed in \autoref{sec:implementation}, we're using a remove-and-restore strategy. For every recurrence, no matter what the branching factor, we do not need to create new objects: the whole algorithm runs on one object. This comes at the cost of a stack of restore information, which consists of lists of vertices that have been removed from the graph. Since at any moment a node will either be in the graph \emph{or} on that stack (and then only once), the space complexity of the graph object is still $\pazocal O(|V|+|E|)$.
+
+There is one small exception to this: the \texttt{findStronglyConnectedComponent} method does create a new object. Space-wise, the worst case is then the case of a graph consisting of $|V|$ strongly connected components. In this case, we need $|V|$ graph objects, giving a quadratic worst case space complexity of $\pazocal O\left(|V|\cdot(|V|+|E|)\right)$. This does not take the garbage collector into account. A closer look at the implementation of the \texttt{maximumIndependentSetSize} method will show that in fact at most two graph objects are needed at any point in the program. In any case, thanks to the efficiency gained by applying \autoref{lem:components}, the algorithm runs very fast on graphs with many strongly connected components, so we do not need to worry about the theoretically quadratic space complexity.
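+
+As a side note on why this bound is quadratic in $|V|$ (a sketch, using the maximum degree of $4$ from the analysis above): every edge has two endpoints, so
+$$|E| \le \frac{4|V|}{2} = 2|V|,$$
+and therefore $\pazocal O(|V|+|E|)=\pazocal O(|V|)$ for a single graph and $\pazocal O\left(|V|\cdot(|V|+|E|)\right)=\pazocal O\left(|V|^2\right)$ for $|V|$ graph objects.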