1 files changed, 103 insertions, 3 deletions
diff --git a/thesis/intro.tex b/thesis/intro.tex
index 5be42af..0b75b49 100644
--- a/thesis/intro.tex
+++ b/thesis/intro.tex
@@ -1,6 +1,56 @@
 \section{Introduction}
 \label{sec:intro}
 
+\begin{figure*}[b]
+	\centering
+	\begin{subfigure}[b]{.2\linewidth}
+		\centering
+		\begin{tikzpicture}[clean]
+			\node (start) {\clean{Start}};
+		\end{tikzpicture}
+		\caption{Initially}
+	\end{subfigure}%
+	\begin{subfigure}[b]{.2\linewidth}
+		\centering
+		\begin{tikzpicture}[clean]
+			\node[node args=1] (fac) {\clean{fac}};
+			\node[below=of fac.arg1.south] (4) {\clean{4}};
+			\draw (fac.arg1) -- (4.north);
+		\end{tikzpicture}
+		\caption{Applying \clean{Start}}
+	\end{subfigure}%
+	\begin{subfigure}[b]{.3\linewidth}
+		\centering
+		\begin{tikzpicture}[clean]
+			\node[node args=2] (times) {\clean{*}};
+			\node[below=of times.arg1.south] (4) {\clean{4}};
+			\node[node args=1,right=of times.arg2.east] (fac) {\clean{fac}};
+			\node[node args=2,right=of fac.arg1.east] (minus) {\clean{-}};
+			\node[right=of minus.arg2.east] (1) {\clean{1}};
+			\draw (times.arg1) -- (4.north);
+			\draw (times.arg2) -- (fac.west);
+			\draw (fac.arg1) -- (minus.west);
+			\draw (minus.arg1) |- (4.east);
+			\draw (minus.arg2) -- (1.west);
+		\end{tikzpicture}
+		\caption{Applying \clean{fac n}}
+	\end{subfigure}%
+	\begin{subfigure}[b]{.2\linewidth}
+		\centering
+		\begin{tikzpicture}[clean]
+			\node[node args=2] (times) {\clean{*}};
+			\node[below=of times.arg1.south] (4) {\clean{4}};
+			\node[node args=1,right=of times.arg2.east] (fac) {\clean{fac}};
+			\node[below=of fac.arg1.south] (3) {\clean{3}};
+			\draw (times.arg1) -- (4.north);
+			\draw (times.arg2) -- (fac.west);
+			\draw (fac.arg1) -- (3.north);
+		\end{tikzpicture}
+		\caption{Applying \clean{-}}
+	\end{subfigure}
+	\caption{Rewriting a Clean node\label{fig:intro:rewriting}}
+\end{figure*}
+
 \begin{multicols}{2}
 	
 \subsection{Introduction}
@@ -24,20 +74,70 @@ Starting from ARMv6T2, an extension to the Thumb instruction set, known as Thumb
 This gives the ARM and Thumb instruction sets `almost identical functionality'~\citep[A1.2]{armv7ar},
 	whereas Thumb gives a smaller code size.
 
+Using the Unified Assembly Language (UAL), one can write assembly code for both the ARM and the Thumb instruction set and fix the target ISA only at assemble-time.
+
+\label{sec:intro:interworking}
+The ARM and Thumb instruction sets are designed to \emph{interwork}:
+	different parts of a program can be assembled for different instruction sets
+	and it is possible to switch instruction set when the program counter is written~\citep[A4.1]{armv7ar}.
+The Thumb-2 code generator proposed in this thesis does not produce ARM code,
+	though the existence of the interworking facility has effects on the techniques that can be used in it.
+This will be covered in \autoref{sec:two-bits}.
+
 \subsection{Clean}
 \label{sec:intro:clean}
 Clean is `a general purpose, state-of-the-art, pure and lazy functional programming language designed for making real-world applications'~\cite{clean}.
 It evolved from LEAN, an intermediate language based on graph rewriting~\citep{lean}.
-As such, one of the steps in the compilation of a Clean program is the translation to bytecode for the ABC-machine, an abstract graph rewriting machine~\citep{execspecs}.
-A code generator is used to generate code equivalent to the ABC-code for concrete machines.
-%todo
+A Clean program consists of a list of rewrite rules, for example:
+
+\begin{minted}{clean}
+	fac 0 = 1
+	fac n = n * (fac (n - 1))
+	Start = fac 4
+\end{minted}
+
+Executing a Clean program means rewriting the \clean{Start} rule until no rewriting is possible any more.
+The first rewriting steps of the above program are shown in \autoref{fig:intro:rewriting}.
+
+A Clean program is compiled to ABC-code, a platform-independent bytecode for an abstract graph rewriting machine~\citep{execspecs}.
+A code generator is used to generate machine code equivalent to the ABC-code for concrete machines.
+This, again, is done in several steps, so that only a relatively small part of the compilation process is target-dependent.
+For adapting the ARM code generator to work for Thumb-2 one needs to have only very basic knowledge of the ABC-machine.
+For this reason we will not discuss its internals here ---
+	they have been described in \cite{execspecs} and \cite{m680x0}.
+
+After compilation, a Clean program is linked together with the Clean run-time system.
+The RTS ensures that that the \clean{Start} node is reduced and printed and takes care of garbage collecting.
+This system is written in C and assembly, so making Clean available on Thumb-2 inherently means adapting the platform-dependent parts of the RTS as well.
+
+\subsection{A Thumb backend for Clean}
+\label{sec:intro:clean-thumb}
+In this thesis, we propose a Thumb backend for Clean.
+The ARM code generator and RTS were taken as a starting point.
+Thanks to the Unified Assembly Language, only little time was needed to make these systems assemble for the Thumb instruction set.
+
+The proposed backend does not use ARM and Thumb interworking.
+The reason for this is threefold.
+First, there are several processors, like the ARMv7-M series~\citep[A4.1]{armv7m}, that do support Thumb instructions but cannot run ARM code.
+By only using Thumb, we target the widest possible range of devices.
+Second, we doubt that using interworking can be done efficiently.
+In the run-time system, only minimal time overhead is introduced by using Thumb instructions.
+For generated code it would be complicated to detect if the ARM or Thumb instruction set would give better results,
+	and this would give significantly better results only in specific cases.
+Third, the problem discussed in \autoref{sec:two-bits} could be solved efficiently only without interworking.
+Using interworking would introduce overhead at every branch instruction,
+	since the solution to this problem would have to be adapted.
 
 \subsection{Terminology}
 \label{sec:intro:terminology}
 In this thesis, I will usually write `Thumb' where the Thumb-2 extension is meant.
 Only when the distinction with pre-ARMv6T2 Thumb is important will I distinguish between (early) Thumb and Thumb-2.
+As for `ARM', it should be clear from the context whether the architecture or the instruction set is meant.
 
 For brevity, a number of common abbreviations is used.
 An overview can be found in \autoref{sec:abbreviations}.
 
+\subsection{Organisation}
+\label{sec:intro:organisation}
+
 \end{multicols}