summaryrefslogtreecommitdiff
path: root/thesis/intro.tex
diff options
context:
space:
mode:
Diffstat (limited to 'thesis/intro.tex')
-rw-r--r--thesis/intro.tex106
1 files changed, 103 insertions, 3 deletions
diff --git a/thesis/intro.tex b/thesis/intro.tex
index 5be42af..0b75b49 100644
--- a/thesis/intro.tex
+++ b/thesis/intro.tex
@@ -1,6 +1,56 @@
\section{Introduction}
\label{sec:intro}
+\begin{figure*}[b]
+ \centering
+ \begin{subfigure}[b]{.2\linewidth}
+ \centering
+ \begin{tikzpicture}[clean]
+ \node (start) {\clean{Start}};
+ \end{tikzpicture}
+ \caption{Initially}
+ \end{subfigure}%
+ \begin{subfigure}[b]{.2\linewidth}
+ \centering
+ \begin{tikzpicture}[clean]
+ \node[node args=1] (fac) {\clean{fac}};
+ \node[below=of fac.arg1.south] (4) {\clean{4}};
+ \draw (fac.arg1) -- (4.north);
+ \end{tikzpicture}
+ \caption{Applying \clean{Start}}
+ \end{subfigure}%
+ \begin{subfigure}[b]{.3\linewidth}
+ \centering
+ \begin{tikzpicture}[clean]
+ \node[node args=2] (times) {\clean{*}};
+ \node[below=of times.arg1.south] (4) {\clean{4}};
+ \node[node args=1,right=of times.arg2.east] (fac) {\clean{fac}};
+ \node[node args=2,right=of fac.arg1.east] (minus) {\clean{-}};
+ \node[right=of minus.arg2.east] (1) {\clean{1}};
+ \draw (times.arg1) -- (4.north);
+ \draw (times.arg2) -- (fac.west);
+ \draw (fac.arg1) -- (minus.west);
+ \draw (minus.arg1) |- (4.east);
+ \draw (minus.arg2) -- (1.west);
+ \end{tikzpicture}
+ \caption{Applying \clean{fac n}}
+ \end{subfigure}%
+ \begin{subfigure}[b]{.2\linewidth}
+ \centering
+ \begin{tikzpicture}[clean]
+ \node[node args=2] (times) {\clean{*}};
+ \node[below=of times.arg1.south] (4) {\clean{4}};
+ \node[node args=1,right=of times.arg2.east] (fac) {\clean{fac}};
+ \node[below=of fac.arg1.south] (3) {\clean{3}};
+ \draw (times.arg1) -- (4.north);
+ \draw (times.arg2) -- (fac.west);
+ \draw (fac.arg1) -- (3.north);
+ \end{tikzpicture}
+ \caption{Applying \clean{-}}
+ \end{subfigure}
+ \caption{Rewriting a Clean node\label{fig:intro:rewriting}}
+\end{figure*}
+
\begin{multicols}{2}
\subsection{Introduction}
@@ -24,20 +74,70 @@ Starting from ARMv6T2, an extension to the Thumb instruction set, known as Thumb
This gives the ARM and Thumb instruction sets `almost identical functionality'~\citep[A1.2]{armv7ar},
whereas Thumb gives a smaller code size.
+Using the Unified Assembly Language (UAL), one can write assembly code for both the ARM and the Thumb instruction set and fix the target ISA only at assemble-time.
+
+\label{sec:intro:interworking}
+The ARM and Thumb instruction sets are designed to \emph{interwork}:
+ different parts of a program can be assembled for different instruction sets
+ and it is possible to switch instruction set when the program counter is written~\citep[A4.1]{armv7ar}.
+The Thumb-2 code generator proposed in this thesis does not produce ARM code,
+ though the existence of the interworking facility has effects on the techniques that can be used in it.
+This will be covered in \autoref{sec:two-bits}.
+
\subsection{Clean}
\label{sec:intro:clean}
Clean is `a general purpose, state-of-the-art, pure and lazy functional programming language designed for making real-world applications'~\cite{clean}.
It evolved from LEAN, an intermediate language based on graph rewriting~\citep{lean}.
-As such, one of the steps in the compilation of a Clean program is the translation to bytecode for the ABC-machine, an abstract graph rewriting machine~\citep{execspecs}.
-A code generator is used to generate code equivalent to the ABC-code for concrete machines.
-%todo
+A Clean program consists of a list of rewrite rules, for example:
+
+\begin{minted}{clean}
+ fac 0 = 1
+ fac n = n * (fac (n - 1))
+ Start = fac 4
+\end{minted}
+
+Executing a Clean program means rewriting the \clean{Start} rule until no rewriting is possible any more.
+The first rewriting steps of the above program are shown in \autoref{fig:intro:rewriting}.
+
+A Clean program is compiled to ABC-code, a platform-independent bytecode for an abstract graph rewriting machine~\citep{execspecs}.
+A code generator is used to generate machine code equivalent to the ABC-code for concrete machines.
+This, again, is done in several steps, so that only a relatively small part of the compilation process is target-dependent.
+For adapting the ARM code generator to work for Thumb-2 one needs to have only very basic knowledge of the ABC-machine.
+For this reason we will not discuss its internals here ---
+ they have been described in \cite{execspecs} and \cite{m680x0}.
+
+After compilation, a Clean program is linked together with the Clean run-time system.
+The RTS ensures that that the \clean{Start} node is reduced and printed and takes care of garbage collecting.
+This system is written in C and assembly, so making Clean available on Thumb-2 inherently means adapting the platform-dependent parts of the RTS as well.
+
+\subsection{A Thumb backend for Clean}
+\label{sec:intro:clean-thumb}
+In this thesis, we propose a Thumb backend for Clean.
+The ARM code generator and RTS were taken as a starting point.
+Thanks to the Unified Assembly Language, only little time was needed to make these systems assemble for the Thumb instruction set.
+
+The proposed backend does not use ARM and Thumb interworking.
+The reason for this is threefold.
+First, there are several processors, like the ARMv7-M series~\citep[A4.1]{armv7m}, that do support Thumb instructions but cannot run ARM code.
+By only using Thumb, we target the widest possible range of devices.
+Second, we doubt that using interworking can be done efficiently.
+In the run-time system, only minimal time overhead is introduced by using Thumb instructions.
+For generated code it would be complicated to detect if the ARM or Thumb instruction set would give better results,
+ and this would give significantly better results only in specific cases.
+Third, the problem discussed in \autoref{sec:two-bits} could be solved efficiently only without interworking.
+Using interworking would introduce overhead at every branch instruction,
+ since the solution to this problem would have to be adapted.
\subsection{Terminology}
\label{sec:intro:terminology}
In this thesis, I will usually write `Thumb' where the Thumb-2 extension is meant.
Only when the distinction with pre-ARMv6T2 Thumb is important will I distinguish between (early) Thumb and Thumb-2.
+As for `ARM', it should be clear from the context whether the architecture or the instruction set is meant.
For brevity, a number of common abbreviations is used.
An overview can be found in \autoref{sec:abbreviations}.
+\subsection{Organisation}
+\label{sec:intro:organisation}
+
\end{multicols}