aboutsummaryrefslogblamecommitdiff
path: root/Practical1/report/analysis.tex
blob: aae5697b33555c42d13a26344f731bd56bc2110a (plain) (tree)
1
2
3
4
5
6
7
8
9

                             
                                                                                                                                                                                                                                                                                                                                      

                                                                                                                                                                                                                                                                                                                                                       
                                                                                                                                                                                                                                                                                                                 






                                                                                                                                                                                                                               
                                                    
                                                              

                                                  


                                                                                                              

                                                  














                                                                                                                             
                                                                                             

                            






                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
                                       



                                                                                                                                                      








































                                                                                                                
                 
                                                                                             

                            





                                                                                                                                                                                                                                                                                                                                                                                                                                     

                                    
































































                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           
 
\section{Complexity analysis}
\label{sec:analysis}

Robson provides us with a very detailed analysis proving that the $\ms$ algorithm runs in $\pazocal O(\mathrm{polynomial}(n)\cdot2^{cn})$, where $c<0.296$ and with $n$ the number of vertices of the input graph \cite{robson}. However, the argument goes over my head, so I will provide my own, simpler, less exact analysis here.

We write $T_{\ms}(n)$ for the running time of $\ms$ on a graph $G$ with $|G|=n$. For $\ms^n$ we write $T_{\ms^n}(p,q)$ for the appliance of $\ms^n$ on a graph $G$ with $|G|=p$ and a set $S$ to pick $n$ vertices from with $|S|=q$. In the case of $n=1$, we write $T_{\ms^1}(p)$ instead of $T_{\ms^1}(p,2)$, because we only use the case of $q=2$.

For the sake of brevity, I will use but not discuss here that basic graph operations such as dominance checking, edge checking, degree computation and others run in $\pazocal O(\mathrm{polynomial}(n))$. Informally, this can be checked by looking at the functions in the Java code accompanying this report.

\subsection{$\ms^1$: \autoref{alg:ms1}}
\label{sec:analysis:ms1}

This function is a simple case distinction. Let's therefore consider the complexities of all the cases one by one. The results, along with their line numbers in \autoref{alg:ms1}, may be found in \autoref{tab:analysis-ms1}.

\begin{table}[h]
    \centering
    \renewcommand{\arraystretch}{1.3}
    \begin{tabular}{r | p{29mm} | p{19mm} | p{50mm}}
        Ln. & Case & Complexity & Argumentation \\\hline\hline
        \ref{alg:ms1:case-d1}       & $d(s_1)\leq 1$
            & $T_{\ms}(n)$;\newline
              $T_{\ms}(n-1)+\pazocal O(n)$
            & Simple recurrence; see below\\\hline
        \ref{alg:ms1:case-edge-d4} & $e(s_1,s_2), d(s_1)>3$
            & $2T_{\ms}(n-5)$
            & Both vertices have degree four, so we run $\ms$ twice on a graph with five vertices less\\\hline
        \ref{alg:ms1:case-edge-d3} & $e(s_1,s_2), d(s_1)\leq3$
            & $T_{\ms}(n)$;\newline
              $3T_{\ms}(n-2)$
            & Simple recurrence; see below\\\hline
        \ref{alg:ms1:intersection} & $N(s_1)\cap N(s_2)\neq\emptyset$
            & $T_{\ms^1}(n-1)$ 
            & At least one vertex removed \\\hline
        \ref{alg:ms1:case-2-2-edge} & $d(s_1)=d(s_2)=2$, $e(w_1,w_2)$
            & $T_{\ms}(n-3)$
            & Simple recurrence \\\hline
        \ref{alg:ms1:case-2-2-dominance} & $d(s_1)=d(s_2)=2$, $N^2(s_1)\subset N(s_2)$
            & $T_{\ms}(n-6)$
            & Exactly six vertices removed \\\hline
        \ref{alg:ms1:case-2-2-else} & $d(s_1)=d(s_2)=2$, otherwise
            & $T_{\ms}(n-3) + T_{\ms}(n-7)$ 
            & In the first case three vertices removed, in the second case seven \\\hline
        \ref{alg:ms1:case-general} & Otherwise
            & $T_{\ms}(n-3) + T_{\ms^2}(n-4,4)$
            & In the first case at least three vertices removed, in the second case four, and at most four to choose two from
    \end{tabular}
    \caption{Worst case time complexity analysis of the different cases in \autoref{alg:ms1}}
    \label{tab:analysis-ms1}
\end{table}

The case in line \ref{alg:ms1:case-d1} is problematic in the sense that the size of the graph is not reduced. However, we know that $d(s_1)\le1$, which means that there is either a strongly connected component of size $1$, namely $\{s_1\}$, or that there is a $v$, namely $s_1$, with $d(v)=1$. In the first case, the time complexity will reduce to $T_{\ms}(1)+T_{\ms}(n-1)=T_{\ms}(n-1) + \pazocal O(n)$, as we will see below in \autoref{tab:analysis-ms}. In the second case, the time complexity reduces to $T_{\ms}(n-2) < T_{\ms}(n-1) + \pazocal O(n)$.

We need to do something similar for the case in line \ref{alg:ms1:case-edge-d3}. In this case, we know there is a vertex $v$ with $d(v)\le3$. This is broken down into a number of cases in \autoref{alg:ms}, of which the time complexity may be found in \autoref{tab:analysis-ms} below. It is easy to see that all of the relevant cases have a running time less than $T_{\ms}(n-2)+T_{\ms^2}(n-3,6)+T_{\ms^2}(n-1,3)$. Since $T_{\ms^2}(n-3,\dots)<T_{\ms}(n-2)$ and $T_{\ms^2}(n-1,3)<T_{\ms}(n-2)$ by \autoref{tab:analysis-ms2} below, we know that this is surely less than $3T_{\ms}(n-2)$.

\begin{thmproof}[ms1-le-ms]{lemma}{There exists a $c$ for which for all $n$ it holds that $T_{\ms^1}(n)\le c\cdot T_{\ms}(n-1)$.}
    Every case in \autoref{tab:analysis-ms1} suffices.
\end{thmproof}

\subsection{$\ms^2$: \autoref{alg:ms2}}
\label{sec:analysis:ms2}
Also this is a simple case distinction. Let's again put the different cases in a table, along with their line numbers. See \autoref{tab:analysis-ms2}.

\begin{table}[h]
    \centering
    \renewcommand{\arraystretch}{1.3}
    \begin{tabular}{r | c | p{29mm} | p{19mm} | p{40mm}}
        Ln. & $|S|$ & Case & Complexity & Argumentation \\\hline\hline
        \ref{alg:ms2:case-s1} & $0,1$ & All
            & $\pazocal O(1)$
            & Constant return value\\\hline
        \ref{alg:ms2:case-s2-edge} & $2$ & $e(s_1,s_2)$
            & $\pazocal O(1)$
            & Constant return value\\\hline
        \ref{alg:ms2:case-s2-noedge} & $2$ & Otherwise
            & $T_{\ms}(n-2)$
            & At least two vertices removed\\\hline
        \ref{alg:ms2:case-s3-d0} & $3$ & $d(s_i)=0$
            & $T_{\ms^1}(n-1)$
            & Simple recurrence\\\hline
        \ref{alg:ms2:case-s3-cycle} & $3$ & $s_1,s_2,s_3$ form a three-cycle
            & $\pazocal O(1)$
            & Constant return value\\\hline
        \ref{alg:ms2:case-s3-path} & $3$ & $i\neq k \land e(s_i,s_j)\land e(s_i,s_k)$
            & $T_{\ms}(n-3)$
            & At least $s_1,s_2,s_3$ removed\\\hline
        \ref{alg:ms2:case-s3-edge} & $3$ & $e(s_i,s_j)$
            & $T_{\ms^1}(n-1)$
            & At least $s_k$ removed\\\hline
        \ref{alg:ms2:case-s3-intersection} & $3$ & $N(s_i)\cap N(s_j)\neq\emptyset$
            & $T_{\ms^2}(n-1,2)$
            & At least one element in the intersection which is removed\\\hline
        \ref{alg:ms2:case-s3-d1} & $3$ & $d(s_i)=1$
            & $T_{\ms^1}(n-2)$
            & Both $s_i$ and its neighbour removed\\\hline
        \ref{alg:ms2:case-s3-otherwise} & $3$ & Otherwise
            & $T_{\ms^1}(n-3) + T_{\ms^2}(n-7,4)$
            & Every $s_i$ has $d(s_i)\ge2$\\\hline
        \ref{alg:ms2:case-s4-dle3} & $4$ & $d(s_i)\le3$
            & $T_{\ms}(n)$; \newline
              $3T_{\ms}(n-2)$
            & Simple recurrence; see below\\\hline
        \ref{alg:ms2:case-s4-no-dle3} & $4$ & Otherwise
            & $T_{\ms}(n-5) + T_{\ms^2}(n-1,3)$
            & $d(s_1)\ge4$ so at least $5$ removed in the first case; second case is a simple recurrence\\\hline
        \ref{alg:ms2:case-otherwise} & $5+$ & All
            & $T_{\ms}(n)$
            & Simple recurrence
    \end{tabular}
    \caption{Worst case time complexity analysis of the different cases in \autoref{alg:ms2}}
    \label{tab:analysis-ms2}
\end{table}

The case in line \ref{alg:ms2:case-s4-dle3} is again problematic. However, as above for line \ref{alg:ms1:case-edge-d3} of \autoref{alg:ms1}, this reduces to $3T_{\ms}(n-2)$. 

Also line \ref{alg:ms2:case-otherwise} is problematic. However, in \autoref{alg:ms} we see that the only time $\ms^2$ is called with $|S|>4$ is in line \ref{alg:ms:case-d2-noedge}. In \autoref{tab:analysis-ms} below we see that this particular case $3$ vertices were already removed from the graph. We will use this below in the analysis of \autoref{alg:ms}, line \ref{alg:ms:case-d2-noedge} in \autoref{tab:analysis-ms}.

\begin{thmproof}[ms2-le-ms]{lemma}{There exists a $c$ for which for all graphs $G$ and subgraphs $S$ with $|S|\le4$, we have $T_{\ms^2}(|G|,|S|)\le T_{\ms}(n-2)$.}
    Every case in \autoref{tab:analysis-ms2} suffices. Cases with complexity $T_{\ms^1}(n-x)$ suffice by \autoref{lem:ms1-le-ms}.
\end{thmproof}

\subsection{$\ms$: \autoref{alg:ms}}
\label{sec:analysis:ms}
Also this is a simple case distinction. Let's again put the different cases in a table, along with their line numbers. See \autoref{tab:analysis-ms}.

\begin{table}[h]
    \centering
    \renewcommand{\arraystretch}{1.3}
    \begin{tabular}{r | p{28mm} | p{20mm} | p{50mm}}
        Ln. & Case & Complexity & Argumentation \\\hline\hline
        \ref{alg:ms:case-component} & Multiple strongly connected components
            & $T_{\ms}(|C|) + T_{\ms}(n-|C|)$
            & $|C|$ is the size of some strongly connected component\\\hline
        \ref{alg:ms:case-g1} & $|G|\le1$
            & $\pazocal O(n)$
            & Counting the number of vertices\\\hline
        \ref{alg:ms:case-d1} & $d(v)=1$
            & $T_{\ms}(n-2)$
            & $v$ and its neighbour removed\\\hline
        \ref{alg:ms:case-d2-edge} & $d(v)=2$, $e(w,w')$
            & $T_{\ms}(n-3)$
            & $v$ and its two neighbours removed\\\hline
        \ref{alg:ms:case-d2-noedge} & $d(v)=2$, otherwise
            & $T_{\ms}(n-6) + T_{\ms^2}(n-3,6)$; \newline
              $T_{\ms}(n-6) + T_{\ms}(n-3)$
              & $d(w)$ and $d(w')\ge2$; $N^2(v)$ is at most $6$ for any degree-2 vertex in a graph with maximum degree $4$; as we saw in \autoref{sec:analysis:ms2}, this reduces to the second sum\\\hline
        \ref{alg:ms:case-d3} & $d(v)=3$
            & $T_{\ms}(n-4) + T_{\ms^2}(n-1,3)$; \newline
              $T_{\ms}(n-4) + T_{\ms}(n-3)$
            & Simple recurrences; second form by \autoref{lem:ms2-le-ms}\\\hline
        \ref{alg:ms:case-dominance} & $v>w$
            & $T_{\ms}(n-1)$; \newline
              $T_{\ms}(n-3) + T_{\ms}(n-5)$
            & Simple recurrence; see below\\\hline
        \ref{alg:ms:case-otherwise} & Otherwise
            & $T_{\ms}(n-1) + T_{\ms}(n-5)$
            & $d(w)\ge d(v)\le4$ so $|\iN(w)|\le5$
    \end{tabular}
    \caption{Worst case time complexity analysis of the different cases in \autoref{alg:ms}}
    \label{tab:analysis-ms}
\end{table}

We saw already in \autoref{sec:algorithm:good-v:dominance} that after removing $w$ in the case of dominance of $v$ over $w$ on line \ref{alg:ms:case-dominance}, we will have a vertex with degree $3$. Therefore, the recurrence $T_{\ms}(n-1)$ in this case reduces to one of the above cases with $n:=n-1$. This is at most $T_{\ms}(n-3)+T_{\ms}(n-5)$.

We can do something similar in the `Otherwise' case. We can only be in this case if there is no vertex with degree less than $4$. Since all vertices have a maximum degree of $4$, this means that every vertex has degree $4$. In both subcases that are considered in line \ref{alg:ms:case-otherwise} of \autoref{alg:ms}, vertices are removed from the graph, leaving a graph with a vertex of maximum degree $3$. If we use that knowledge, we see from the other cases in \ref{tab:analysis-ms} that the time complexity of the `Otherwise' case can be rewritten as $T_{\ms}(n-3)+T_{\ms}(n-5)+T_{\ms}(n-7)+T_{\ms}(n-9)$.

\begin{thmproof}[otherwise-half-of-cases]{lemma}{Of any $k$ consequent applications of $\ms$, at most $\frac k2$ will be using the inefficient `Otherwise' case.}
    After applying the `Otherwise' case, there will be at least one vertex with degree lower than $4$. In the iteration directly after applying the `Otherwise' case, we don't need to apply it.
\end{thmproof}

\begin{thmproof}[ms-complexity]{theorem}{\autoref{alg:ms} runs in $\pazocal O\left(2^{n/2}\right)$.}
    We see that the time complexity of $\ms(n)$ depends on $\ms(n-c)$ with $c\ge2$. Furthermore, if $c=2$, we have $\ms(n)=\ms(n-2)$, so if $\ms(n)$ depends on $\ms(n-2)$, it depends on that \emph{alone}, giving a branching factor of $1$. Branching only occurs when $c\ge3$. Only in the `Otherwise' case, the branching factor is $4$, in the other cases it is $2$.
    
    We then consider the search tree $T$ where we compress nodes with branching factor $1$. By \autoref{lem:otherwise-half-of-cases}, only on one of two consecutive levels can we have a branching factor of $4$ (and in the worst case this will happen as much as possible). If we then combine the levels with branching factor $4$ with the levels directly below them (having branching factor $2$), we find a search tree $T'$ with in the worst case a constant branching factor of $8$.
    
    The original search tree $T$ has at most $\frac n3$ levels, since branching only occurs when $c\ge3$. Therefore, the compressed tree $T'$ has, in the worst case, $\frac n6$ levels (note, that this case is worse than the case of a tree with constant branching factor $2$ and $\frac n3$ levels, and therefore also worse than any case in between).
    
    In every vertex of the search tree $T'$ we need polynomial time, since we assumed that basic graph operations run in polynomial time. Therefore, \autoref{alg:ms} runs in $$\pazocal O\left(\left(8\cdot\mathrm{polynomial}(n)\right)^{n/6}\right) = \pazocal O\left(\mathrm{polynomial}(n)^{n/2}\right) = \pazocal O\left(2^{n/2}\right).$$
\end{thmproof}

This result is, naturally, coherent with Robson's result that the algorithm runs in $\pazocal O\left(\mathrm{polynomial}(n)\cdot2^{cn}\right)$ with $c<0.296$ \cite{robson}. We have proven that the bound holds with $c\le\frac12$.

\subsection{Space complexity}
\label{sec:analysis:space}
A graph is stored as a list of nodes, and for every edge both nodes hold one reference. Therefore, the space complexity of a single graph $G=(V,E)$ is $\pazocal O(|V|+|E|)$.

As discussed in \autoref{sec:implementation}, we're using a remove-and-restore strategy. For every recurrence, no matter what the branching factor, we do not need to create new objects: the whole algorithm runs on one object. This happens at the cost of using a stack of restore information, that are lists of vertices that are removed from the graph. Since at any moment a node will either be in the graph \emph{or} on that stack (and then only once so), the space complexity of the graph object is still $\pazocal O(|V|+|E|)$.

There is one small exception to this: the \texttt{findStronglyConnectedComponent} method does create a new object. Space-wise, the worst case is then the case of a graph consisting of $|V|$ strongly connected components. In this case, we need $|V|$ graph objects, giving a quadratic worst case space complexity of $\pazocal O\left(|V|\cdot(|V|+|E|)\right)$. This is not taking into regard the garbage collector. A closer look at the implementation of the \texttt{maximumIndependentSetSize} method will show that in fact at most two graph objects are needed at any point in the program. In any case, thanks to the efficiency gained by applying \autoref{lem:components}, the algorithm runs very fast on graphs with many strongly connected components, so we do not need to worry about the theoretically quadratic space complexity.