diff options
author | Camil Staps | 2015-11-02 15:49:23 +0100 |
---|---|---|
committer | Camil Staps | 2015-11-02 15:49:23 +0100 |
commit | e4ee1f8a1e5707a4d8478a678a03ce664ae8f894 (patch) | |
tree | 7a3f3d49c54d888f7ded3e2dafb3ec03ee6f3672 /Practical1/report/analysis.tex | |
parent | P1 report: typo; margins; copyright; lemmas / proofs on one page (diff) |
Working on report
Diffstat (limited to 'Practical1/report/analysis.tex')
-rw-r--r-- | Practical1/report/analysis.tex | 147 |
1 files changed, 135 insertions, 12 deletions
diff --git a/Practical1/report/analysis.tex b/Practical1/report/analysis.tex index d796f96..aae5697 100644 --- a/Practical1/report/analysis.tex +++ b/Practical1/report/analysis.tex @@ -1,10 +1,12 @@ \section{Complexity analysis} \label{sec:analysis} -Robson provides us with a very detailed analysis proving that the $\ms$ algorithm runs in $\pazocal O(\mathrm{polynomial}(n)\cdot2^{cn})$, where $c<0.296$ and with $n$ the number of vertices of the input graph \cite{robson}. However, the argument is over my head, so I will provide my own, simpler, less exact analysis here. +Robson provides us with a very detailed analysis proving that the $\ms$ algorithm runs in $\pazocal O(\mathrm{polynomial}(n)\cdot2^{cn})$, where $c<0.296$ and with $n$ the number of vertices of the input graph \cite{robson}. However, the argument goes over my head, so I will provide my own, simpler, less exact analysis here. We write $T_{\ms}(n)$ for the running time of $\ms$ on a graph $G$ with $|G|=n$. For $\ms^n$ we write $T_{\ms^n}(p,q)$ for the appliance of $\ms^n$ on a graph $G$ with $|G|=p$ and a set $S$ to pick $n$ vertices from with $|S|=q$. In the case of $n=1$, we write $T_{\ms^1}(p)$ instead of $T_{\ms^1}(p,2)$, because we only use the case of $q=2$. +For the sake of brevity, I will use but not discuss here that basic graph operations such as dominance checking, edge checking, degree computation and others run in $\pazocal O(\mathrm{polynomial}(n))$. Informally, this can be checked by looking at the functions in the Java code accompanying this report. + \subsection{$\ms^1$: \autoref{alg:ms1}} \label{sec:analysis:ms1} @@ -13,17 +15,19 @@ This function is a simple case distinction. Let's therefore consider the complex \begin{table}[h] \centering \renewcommand{\arraystretch}{1.3} - \begin{tabular}{l | p{29mm} | p{19mm} | p{50mm}} + \begin{tabular}{r | p{29mm} | p{19mm} | p{50mm}} Ln. & Case & Complexity & Argumentation \\\hline\hline \ref{alg:ms1:case-d1} & $d(s_1)\leq 1$ - & $T_{\ms}(n-1)$ - & Simple recurrence \\\hline + & $T_{\ms}(n)$;\newline + $T_{\ms}(n-1)+\pazocal O(n)$ + & Simple recurrence; see below\\\hline \ref{alg:ms1:case-edge-d4} & $e(s_1,s_2), d(s_1)>3$ & $2T_{\ms}(n-5)$ & Both vertices have degree four, so we run $\ms$ twice on a graph with five vertices less\\\hline \ref{alg:ms1:case-edge-d3} & $e(s_1,s_2), d(s_1)\leq3$ - & $T_{\ms}(n)$ - & Simple recurrence \\\hline + & $T_{\ms}(n)$;\newline + $3T_{\ms}(n-2)$ + & Simple recurrence; see below\\\hline \ref{alg:ms1:intersection} & $N(s_1)\cap N(s_2)\neq\emptyset$ & $T_{\ms^1}(n-1)$ & At least one vertex removed \\\hline @@ -40,10 +44,18 @@ This function is a simple case distinction. Let's therefore consider the complex & $T_{\ms}(n-3) + T_{\ms^2}(n-4,4)$ & In the first case at least three vertices removed, in the second case four, and at most four to choose two from \end{tabular} - \caption{Worst-case time complexity analysis of the different cases in \autoref{alg:ms1}} + \caption{Worst case time complexity analysis of the different cases in \autoref{alg:ms1}} \label{tab:analysis-ms1} \end{table} +The case in line \ref{alg:ms1:case-d1} is problematic in the sense that the size of the graph is not reduced. However, we know that $d(s_1)\le1$, which means that there is either a strongly connected component of size $1$, namely $\{s_1\}$, or that there is a $v$, namely $s_1$, with $d(v)=1$. In the first case, the time complexity will reduce to $T_{\ms}(1)+T_{\ms}(n-1)=T_{\ms}(n-1) + \pazocal O(n)$, as we will see below in \autoref{tab:analysis-ms}. In the second case, the time complexity reduces to $T_{\ms}(n-2) < T_{\ms}(n-1) + \pazocal O(n)$. + +We need to do something similar for the case in line \ref{alg:ms1:case-edge-d3}. In this case, we know there is a vertex $v$ with $d(v)\le3$. This is broken down into a number of cases in \autoref{alg:ms}, of which the time complexity may be found in \autoref{tab:analysis-ms} below. It is easy to see that all of the relevant cases have a running time less than $T_{\ms}(n-2)+T_{\ms^2}(n-3,6)+T_{\ms^2}(n-1,3)$. Since $T_{\ms^2}(n-3,\dots)<T_{\ms}(n-2)$ and $T_{\ms^2}(n-1,3)<T_{\ms}(n-2)$ by \autoref{tab:analysis-ms2} below, we know that this is surely less than $3T_{\ms}(n-2)$. + +\begin{thmproof}[ms1-le-ms]{lemma}{There exists a $c$ for which for all $n$ it holds that $T_{\ms^1}(n)\le c\cdot T_{\ms}(n-1)$.} + Every case in \autoref{tab:analysis-ms1} suffices. +\end{thmproof} + \subsection{$\ms^2$: \autoref{alg:ms2}} \label{sec:analysis:ms2} Also this is a simple case distinction. Let's again put the different cases in a table, along with their line numbers. See \autoref{tab:analysis-ms2}. @@ -51,16 +63,127 @@ Also this is a simple case distinction. Let's again put the different cases in a \begin{table}[h] \centering \renewcommand{\arraystretch}{1.3} - \begin{tabular}{l | p{29mm} | p{19mm} | p{50mm}} - Ln. & Case & Complexity & Argumentation \\\hline\hline + \begin{tabular}{r | c | p{29mm} | p{19mm} | p{40mm}} + Ln. & $|S|$ & Case & Complexity & Argumentation \\\hline\hline + \ref{alg:ms2:case-s1} & $0,1$ & All + & $\pazocal O(1)$ + & Constant return value\\\hline + \ref{alg:ms2:case-s2-edge} & $2$ & $e(s_1,s_2)$ + & $\pazocal O(1)$ + & Constant return value\\\hline + \ref{alg:ms2:case-s2-noedge} & $2$ & Otherwise + & $T_{\ms}(n-2)$ + & At least two vertices removed\\\hline + \ref{alg:ms2:case-s3-d0} & $3$ & $d(s_i)=0$ + & $T_{\ms^1}(n-1)$ + & Simple recurrence\\\hline + \ref{alg:ms2:case-s3-cycle} & $3$ & $s_1,s_2,s_3$ form a three-cycle + & $\pazocal O(1)$ + & Constant return value\\\hline + \ref{alg:ms2:case-s3-path} & $3$ & $i\neq k \land e(s_i,s_j)\land e(s_i,s_k)$ + & $T_{\ms}(n-3)$ + & At least $s_1,s_2,s_3$ removed\\\hline + \ref{alg:ms2:case-s3-edge} & $3$ & $e(s_i,s_j)$ + & $T_{\ms^1}(n-1)$ + & At least $s_k$ removed\\\hline + \ref{alg:ms2:case-s3-intersection} & $3$ & $N(s_i)\cap N(s_j)\neq\emptyset$ + & $T_{\ms^2}(n-1,2)$ + & At least one element in the intersection which is removed\\\hline + \ref{alg:ms2:case-s3-d1} & $3$ & $d(s_i)=1$ + & $T_{\ms^1}(n-2)$ + & Both $s_i$ and its neighbour removed\\\hline + \ref{alg:ms2:case-s3-otherwise} & $3$ & Otherwise + & $T_{\ms^1}(n-3) + T_{\ms^2}(n-7,4)$ + & Every $s_i$ has $d(s_i)\ge2$\\\hline + \ref{alg:ms2:case-s4-dle3} & $4$ & $d(s_i)\le3$ + & $T_{\ms}(n)$; \newline + $3T_{\ms}(n-2)$ + & Simple recurrence; see below\\\hline + \ref{alg:ms2:case-s4-no-dle3} & $4$ & Otherwise + & $T_{\ms}(n-5) + T_{\ms^2}(n-1,3)$ + & $d(s_1)\ge4$ so at least $5$ removed in the first case; second case is a simple recurrence\\\hline + \ref{alg:ms2:case-otherwise} & $5+$ & All + & $T_{\ms}(n)$ + & Simple recurrence \end{tabular} - \caption{Worst-case time complexity analysis of the different cases in \autoref{alg:ms2}} + \caption{Worst case time complexity analysis of the different cases in \autoref{alg:ms2}} \label{tab:analysis-ms2} \end{table} -%todo +The case in line \ref{alg:ms2:case-s4-dle3} is again problematic. However, as above for line \ref{alg:ms1:case-edge-d3} of \autoref{alg:ms1}, this reduces to $3T_{\ms}(n-2)$. + +Also line \ref{alg:ms2:case-otherwise} is problematic. However, in \autoref{alg:ms} we see that the only time $\ms^2$ is called with $|S|>4$ is in line \ref{alg:ms:case-d2-noedge}. In \autoref{tab:analysis-ms} below we see that this particular case $3$ vertices were already removed from the graph. We will use this below in the analysis of \autoref{alg:ms}, line \ref{alg:ms:case-d2-noedge} in \autoref{tab:analysis-ms}. + +\begin{thmproof}[ms2-le-ms]{lemma}{There exists a $c$ for which for all graphs $G$ and subgraphs $S$ with $|S|\le4$, we have $T_{\ms^2}(|G|,|S|)\le T_{\ms}(n-2)$.} + Every case in \autoref{tab:analysis-ms2} suffices. Cases with complexity $T_{\ms^1}(n-x)$ suffice by \autoref{lem:ms1-le-ms}. +\end{thmproof} \subsection{$\ms$: \autoref{alg:ms}} \label{sec:analysis:ms} -%todo +Also this is a simple case distinction. Let's again put the different cases in a table, along with their line numbers. See \autoref{tab:analysis-ms}. + +\begin{table}[h] + \centering + \renewcommand{\arraystretch}{1.3} + \begin{tabular}{r | p{28mm} | p{20mm} | p{50mm}} + Ln. & Case & Complexity & Argumentation \\\hline\hline + \ref{alg:ms:case-component} & Multiple strongly connected components + & $T_{\ms}(|C|) + T_{\ms}(n-|C|)$ + & $|C|$ is the size of some strongly connected component\\\hline + \ref{alg:ms:case-g1} & $|G|\le1$ + & $\pazocal O(n)$ + & Counting the number of vertices\\\hline + \ref{alg:ms:case-d1} & $d(v)=1$ + & $T_{\ms}(n-2)$ + & $v$ and its neighbour removed\\\hline + \ref{alg:ms:case-d2-edge} & $d(v)=2$, $e(w,w')$ + & $T_{\ms}(n-3)$ + & $v$ and its two neighbours removed\\\hline + \ref{alg:ms:case-d2-noedge} & $d(v)=2$, otherwise + & $T_{\ms}(n-6) + T_{\ms^2}(n-3,6)$; \newline + $T_{\ms}(n-6) + T_{\ms}(n-3)$ + & $d(w)$ and $d(w')\ge2$; $N^2(v)$ is at most $6$ for any degree-2 vertex in a graph with maximum degree $4$; as we saw in \autoref{sec:analysis:ms2}, this reduces to the second sum\\\hline + \ref{alg:ms:case-d3} & $d(v)=3$ + & $T_{\ms}(n-4) + T_{\ms^2}(n-1,3)$; \newline + $T_{\ms}(n-4) + T_{\ms}(n-3)$ + & Simple recurrences; second form by \autoref{lem:ms2-le-ms}\\\hline + \ref{alg:ms:case-dominance} & $v>w$ + & $T_{\ms}(n-1)$; \newline + $T_{\ms}(n-3) + T_{\ms}(n-5)$ + & Simple recurrence; see below\\\hline + \ref{alg:ms:case-otherwise} & Otherwise + & $T_{\ms}(n-1) + T_{\ms}(n-5)$ + & $d(w)\ge d(v)\le4$ so $|\iN(w)|\le5$ + \end{tabular} + \caption{Worst case time complexity analysis of the different cases in \autoref{alg:ms}} + \label{tab:analysis-ms} +\end{table} + +We saw already in \autoref{sec:algorithm:good-v:dominance} that after removing $w$ in the case of dominance of $v$ over $w$ on line \ref{alg:ms:case-dominance}, we will have a vertex with degree $3$. Therefore, the recurrence $T_{\ms}(n-1)$ in this case reduces to one of the above cases with $n:=n-1$. This is at most $T_{\ms}(n-3)+T_{\ms}(n-5)$. + +We can do something similar in the `Otherwise' case. We can only be in this case if there is no vertex with degree less than $4$. Since all vertices have a maximum degree of $4$, this means that every vertex has degree $4$. In both subcases that are considered in line \ref{alg:ms:case-otherwise} of \autoref{alg:ms}, vertices are removed from the graph, leaving a graph with a vertex of maximum degree $3$. If we use that knowledge, we see from the other cases in \ref{tab:analysis-ms} that the time complexity of the `Otherwise' case can be rewritten as $T_{\ms}(n-3)+T_{\ms}(n-5)+T_{\ms}(n-7)+T_{\ms}(n-9)$. + +\begin{thmproof}[otherwise-half-of-cases]{lemma}{Of any $k$ consequent applications of $\ms$, at most $\frac k2$ will be using the inefficient `Otherwise' case.} + After applying the `Otherwise' case, there will be at least one vertex with degree lower than $4$. In the iteration directly after applying the `Otherwise' case, we don't need to apply it. +\end{thmproof} + +\begin{thmproof}[ms-complexity]{theorem}{\autoref{alg:ms} runs in $\pazocal O\left(2^{n/2}\right)$.} + We see that the time complexity of $\ms(n)$ depends on $\ms(n-c)$ with $c\ge2$. Furthermore, if $c=2$, we have $\ms(n)=\ms(n-2)$, so if $\ms(n)$ depends on $\ms(n-2)$, it depends on that \emph{alone}, giving a branching factor of $1$. Branching only occurs when $c\ge3$. Only in the `Otherwise' case, the branching factor is $4$, in the other cases it is $2$. + + We then consider the search tree $T$ where we compress nodes with branching factor $1$. By \autoref{lem:otherwise-half-of-cases}, only on one of two consecutive levels can we have a branching factor of $4$ (and in the worst case this will happen as much as possible). If we then combine the levels with branching factor $4$ with the levels directly below them (having branching factor $2$), we find a search tree $T'$ with in the worst case a constant branching factor of $8$. + + The original search tree $T$ has at most $\frac n3$ levels, since branching only occurs when $c\ge3$. Therefore, the compressed tree $T'$ has, in the worst case, $\frac n6$ levels (note, that this case is worse than the case of a tree with constant branching factor $2$ and $\frac n3$ levels, and therefore also worse than any case in between). + + In every vertex of the search tree $T'$ we need polynomial time, since we assumed that basic graph operations run in polynomial time. Therefore, \autoref{alg:ms} runs in $$\pazocal O\left(\left(8\cdot\mathrm{polynomial}(n)\right)^{n/6}\right) = \pazocal O\left(\mathrm{polynomial}(n)^{n/2}\right) = \pazocal O\left(2^{n/2}\right).$$ +\end{thmproof} + +This result is, naturally, coherent with Robson's result that the algorithm runs in $\pazocal O\left(\mathrm{polynomial}(n)\cdot2^{cn}\right)$ with $c<0.296$ \cite{robson}. We have proven that the bound holds with $c\le\frac12$. + +\subsection{Space complexity} +\label{sec:analysis:space} +A graph is stored as a list of nodes, and for every edge both nodes hold one reference. Therefore, the space complexity of a single graph $G=(V,E)$ is $\pazocal O(|V|+|E|)$. + +As discussed in \autoref{sec:implementation}, we're using a remove-and-restore strategy. For every recurrence, no matter what the branching factor, we do not need to create new objects: the whole algorithm runs on one object. This happens at the cost of using a stack of restore information, that are lists of vertices that are removed from the graph. Since at any moment a node will either be in the graph \emph{or} on that stack (and then only once so), the space complexity of the graph object is still $\pazocal O(|V|+|E|)$. + +There is one small exception to this: the \texttt{findStronglyConnectedComponent} method does create a new object. Space-wise, the worst case is then the case of a graph consisting of $|V|$ strongly connected components. In this case, we need $|V|$ graph objects, giving a quadratic worst case space complexity of $\pazocal O\left(|V|\cdot(|V|+|E|)\right)$. This is not taking into regard the garbage collector. A closer look at the implementation of the \texttt{maximumIndependentSetSize} method will show that in fact at most two graph objects are needed at any point in the program. In any case, thanks to the efficiency gained by applying \autoref{lem:components}, the algorithm runs very fast on graphs with many strongly connected components, so we do not need to worry about the theoretically quadratic space complexity. |