Diffstat (limited to 'thesis')
-rw-r--r-- thesis/storing-pc.tex | 21
1 file changed, 21 insertions, 0 deletions
diff --git a/thesis/storing-pc.tex b/thesis/storing-pc.tex
index 81a688b..40da1d1 100644
--- a/thesis/storing-pc.tex
+++ b/thesis/storing-pc.tex
@@ -128,6 +128,27 @@ The offset, 9, is calculated as the number of bytes to the instruction after the
For \ual{b} and \ual{bl} instructions, this means an offset of 9, since these instructions are 32-bit.
The \ual{bx} and \ual{blx} instructions are 16-bit, and require an offset of 7.
+\subsection{Implementation details}
+\label{sec:storing-pc:implementation}
+A jump targets either a label (with \ual{b} or \ual{bl}) or a register (with \ual{bx} or \ual{blx}).
+The latter occurs, for example, for the \abc{jsr_eval} ABC instruction.
+This instruction evaluates a node on the A-stack.
+First, the node entry address is fetched into a register; then we jump to that address.
+
+In the first case, we need one scratch register to store the PC temporarily.
+In the second case, we need two scratch registers: one for the PC and one for the address we are jumping to.
+For the ARM instruction set, we needed zero and one scratch registers, respectively.
+The ARM backend uses two scratch registers, S0 and S1 (for more details, see \cref{sec:reg-alloc:clean}).
+S1 is used only when two scratch registers are needed at the same time, so the ARM code generator uses S0 for the jump address.
+
+The Thumb-2 code generator uses S0 to store the PC temporarily in the first case,
+ and S1 in the second case (where S0 is still used for the address we are jumping to).
+This makes sure that S0 is used as much as possible.
+A slightly simpler implementation would use S1 for the PC in both cases.
+However, in Thumb-2 it is convenient to have little variation in register usage:
+ this allows for a substantial code size optimisation (see \cref{sec:reg-alloc}).
+For this reason it is better to use S0 whenever possible.
+
\subsection{Comparison}
\label{sec:storing-pc:comparison}
Assuming the worst case, that all instructions in the jump block are wide, we need four more bytes in Thumb than in ARM.