Practical2/report/algorithm.tex


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107

\section{Algorithm}
\label{sec:algorithm}

I will propose a solution that uses dynamic programming. First, I will describe the general structure of the algorithm, and simultaneously argue its correctness. After this, we will discuss some optimisations.

\subsection{General program flow}
\label{sec:algorithm:general}

\begin{thmproof}[G_0_k]{lemma}{For any $\bar a, k$, we have $G_0^k(\bar a)=0$.}
    There is only one way to partition $\bar a$; using $k+1$ empty partitions. The profit of every member of the partitioning is zero, and therefore the overall profit of the partitioning is zero as well.
\end{thmproof}

\begin{thmproof}[G_p_0]{lemma}{For any $\bar a, p\le|\bar a|$, we have $G_p^0(\bar a)=\PR(S_0^p(\bar a))$.}
    In this case there is only one way to partition $\bar a$ (using one complete partition). The profit is the difference between the sum of the elements and their rounded sum.
\end{thmproof}

\begin{thmproof}[G_p_k]{lemma}{$G_p^k(\bar a) = \max\left(G_p^{k-1}(\bar a), \max\limits_{i=0}^{p-1}\left(G_i^{k-1}(\bar a) + \PR(S_i^p(\bar a))\right)\right)$.}
    Consider the $p$th element of $\bar a$ ($\bar a_{p-1}$) and a partitioning that gives the maximum gain in the substring until $\bar a_p$, $G_p^k(\bar a)$. We have two possibilities:

    \begin{itemize}[itemsep=0pt]
        \item $\bar a_{p-1}$ is in one substring with $\bar a_i, \bar a_{i+1}, \dots, \bar a_{p-2}$ (possibly $i=p-1$). In this case we would have $G_p^k(\bar a) = G_i^{k-1}(\bar a) + \PR(S_i^p(\bar a))$, since we can independently divide the rest of $\bar a$ over $k-1$ partitions and gain a profit of $\PR(S_i^p(\bar a))$ over that last partition.
        \item It is in fact equally worthwhile to create $k$ partitions rather than $k+1$. In this case, we have $G_p^k(\bar a)=G_p^{k-1}(\bar a)$.
    \end{itemize}

    To get the actual profit, we take the maximum of the two possibilities.
\end{thmproof}

This suggests the general flow of our algorithm: we maintain an $(n+1)\times (k+1)$-table of values for $G_p^d$ with $0\le d\le k, 0\le p\le n$. Using the recurrences from \autoref{lem:G_0_k}, \ref{lem:G_p_0} and \ref{lem:G_p_k}, we generate the whole table.

\begin{algorithm}[h]
    \footnotesize
    \caption{Finding the maximum profit $\PP_k(\bar a)$ of sequence $\bar a$ with at most $k+1$ partitions}
    \label{alg:maximum-profit}
    \begin{algorithmic}[1]
        \REQUIRE $\bar a, k$
        \STATE $n\gets |\bar a|$
        \STATE $G[n+1][k+1]$
        \FOR{$0 \le p \le n$}
            \FOR{$0 \le d \le k$}
                \IF{$p = 0$}
                    \STATE $G[p][d] = 0$ \COMMENT{\autoref{lem:G_0_k}}
                \ELSIF{$d = 0$}
                    \STATE $G[p][d] = \PR(S_0^p(\bar a))$ \COMMENT{\autoref{lem:G_p_0}}
                \ELSE
                \STATE $G[p][d] = \max\left(G[p][d-1], \max\limits_{i=0}^{p-1}\left(G[i][d-1] + \PR(S_i^p(\bar a))\right)\right)$ \COMMENT{\autoref{lem:G_p_k}}
                \ENDIF
            \ENDFOR
        \ENDFOR
        \RETURN $G[n][k]$
    \end{algorithmic}
\end{algorithm}

\begin{thmproof}[maximum-profit]{theorem}{\autoref{alg:maximum-profit} is a correct implementation of the $\PP$ function.}
    This follows from \autoref{lem:G_0_k} through \ref{lem:G_p_k}.
\end{thmproof}

The solution to our problem is then $$\PM_k(\bar a) = \sum\bar a-\PP_k(\bar a) = \sum\bar a - G_{|\bar a|}^k(\bar a).$$

\subsection{Optimisations}
\label{sec:algorithm:optimisations}

A first observation we should make is that values in the input sequence $\bar a$ that are divisible by $5$ do not affect the maximum profit of the substring they're in.

\begin{thmproof}[divide-by-5]{lemma}{For any $\bar a,k$, $\PP_k(\bar a) = \PP_k\left(\seq{a_i \in\bar a \mid 5\nmid a_i}\right)$.}
    No matter in which substring (say $\bar a'$), an $a_i$ that is divisible by $5$ will have no influence on $\PR(\bar a')$ and thus not on $\PP_k(\bar a)$.
\end{thmproof}

On average, \autoref{lem:divide-by-5} allows us to apply \autoref{alg:maximum-profit} on a input $\frac54$ times as small as the original input. While not decreasing the Big-O complexity of the algorithm, this does give us faster running times.

Another optimisation concerns the $S$-function. In \autoref{alg:maximum-profit} above, it is called once for every $0\le i\le p$ in every iteration over $p$. Usually we would compute $S$ in linear time, iterating over $\bar a$. Computing \emph{all} the values for $S$ we need would then take quadratic time. However, we may as well use dynamic programming to compute all values for $S$ we're going to need in the beginning of the iteration over $p$, which can be done in linear time. This is demonstrated in \autoref{alg:maximum-profit-optimised}. In this algorithm, \autoref{alg:maximum-profit} has been optimised using \autoref{lem:divide-by-5} and the optimisation just discussed.

\begin{algorithm}[h]
    \footnotesize
    \caption{Optimised version of \autoref{alg:maximum-profit}, to compute $\PP_k(\bar a)$}
    \label{alg:maximum-profit-optimised}
    \begin{algorithmic}[1]
        \REQUIRE $\bar a, k$
        \STATE $\bar a'\gets\seq{a_i\in\bar a\mid 5\nmid a_i}$ \COMMENT{\autoref{lem:divide-by-5}}
        \STATE $n\gets |\bar a'|$
        \STATE Initialise $G[n+1][k+1]$
        \FOR{$0 \le p \le n$}
            \STATE Initialise $S[p]$ \COMMENT{Here we exploit dynamic programming to compute $S$}
            \IF{$p=0$}
                \STATE $S[0] = 0$
            \ELSE
                \STATE $S[p-1] = \bar a'_{p-1}$
                \FOR{$i=p-2$ through $0$}
                    \STATE $S[i] = \bar a'_i + S[i+1]$
                \ENDFOR
            \ENDIF

            \FOR{$0 \le d \le k$}
                \IF{$p = 0$}
                    \STATE $G[p][d] = 0$ \COMMENT{\autoref{lem:G_0_k}}
                \ELSIF{$d = 0$}
                    \STATE $G[p][d] = \PR(S[0])$ \COMMENT{\autoref{lem:G_p_0}}
                \ELSE
                    \STATE $G[p][d] = \max\left(G[p][d-1], \max\limits_{i=0}^{p-1}\left(G[i][d-1] + \PR(S[i])\right)\right)$ \COMMENT{\autoref{lem:G_p_k}} \label{alg:optimised:line:recurrence}
                \ENDIF
            \ENDFOR
        \ENDFOR
        \RETURN $G[n][k]$
    \end{algorithmic}
\end{algorithm}

The correctness of \autoref{alg:maximum-profit-optimised} depends on the correctness of \autoref{alg:maximum-profit} which has been proven in \autoref{thm:maximum-profit}.