author | Camil Staps | 2016-12-24 22:07:10 +0100 |
---|---|---|
committer | Camil Staps | 2016-12-24 22:07:10 +0100 |
commit | 022f6b04872b2185ee60808d0f589f4e06a78e35 (patch) | |
tree | 58639a47ea46aae42a9401384ac40ff3240bc720 /thesis | |
parent | Add some todos (diff) |
Foreign function interface and register allocation optimisation
Diffstat (limited to 'thesis')
-rw-r--r-- | thesis/reg-alloc.tex | 43 |
1 files changed, 41 insertions, 2 deletions
diff --git a/thesis/reg-alloc.tex b/thesis/reg-alloc.tex
index 0a2e87b..482cd8c 100644
--- a/thesis/reg-alloc.tex
+++ b/thesis/reg-alloc.tex
@@ -203,7 +203,7 @@ The counting method used here is rather simplistic:
 A more accurate method would only count those instructions where using a high or low register actually makes a difference.
 This is much more complicated, and for a rough estimate the simplistic method used here will already allow us to shrink down the code size.
 
-\begin{figure*}[t]
+\begin{figure*}[b]
 \small
 \centering
 \begin{tikzpicture}
@@ -272,7 +272,46 @@ This way, all eight most often used registers are in the lower half except the B
 
 \subsection{The foreign function interface}
 \label{sec:reg-alloc:ffi}
-\todo{discuss how this allocation changes the FFI considering the calling convention.}
+Changing the register allocation has its impact on the foreign function interface.
+Clean provides mechanisms to export Clean functions, so that they can be called from other software,
+	and to call other functions from Clean,
+	as long as the other software respects a certain high-level infrastructure \parencite[chp.~11]{cleanlangrep}.
+
+The low-level interface (which registers are used, for example) is platform-dependent.
+For ARM, it is defined in \cite{armcallstd}.
+
+Some registers have a special function:
+	\ual{r15} is the program counter;
+	\ual{r13} the stack pointer.
+The Clean backend cannot use these registers in another way.
+There are four argument / result / scratch registers, \ual{r0} through \ual{r3}.
+These are not guaranteed to be preserved upon a function call.
+The link register, \ual{r14}, and \ual{r12}, cannot be used freely either:
+	a subroutine jumps to the address in the link register when it is done
+	(and can use \ual{r14} as a local variable if it stores its value upon entering on the stack);
+	and \ual{r12} can be used by the linker when extra instructions are needed
+	when a branch instruction attempts to jump to a label so far away that it does not fit in the instruction any more \parencite[5.3.1.1]{armcallstd}.
+For local variables, \ual{r4} through \ual{r8}, \ual{r10} and \ual{r11} can be used:
+	these have to be preserved by subroutines.
+The last register, \ual{r9}, is platform-dependent.
+Some quick tests indicate that it is callee-saved on our test setup (see \cref{sec:system}).
+
+The register allocation in the ARM backend is optimised for the foreign function interface:
+	the B-stack registers are in the argument registers, because the B-stack is usually empty during function calls.
+The link register \ual{r14} and \ual{r12} are used as scratch registers,
+	because they are only used for short term storage.
+All other variables have to be kept over subroutine calls and are kept in the other registers, which are callee-saved.
+
+Changing the register allocation in the way proposed above means that this foreign function interface will be less efficient.
+Whenever a foreign function needs to be called, the caller-saved registers that need to be preserved have to be saved.
+With the allocation as proposed in \cref{tab:reg-alloc:arm-and-thumb},
+	these are the heap pointer (\ual{r1}), A2 (\ual{r2}) and A3 (\ual{r12}).
+Before every call, a wide \ual{push} instruction needs to be inserted;
+	after every return, a wide \ual{pop} instruction.
+This introduces an 8-byte overhead per foreign function call
+	and also means that every call will be slightly slower.
+We deem the foreign function interface to be less important than the actual Clean code,
+	so this is acceptable.
 
 \subsection{Results}
 \label{sec:reg-alloc:results}
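
The 8-byte figure in the added text can be checked with a small sketch. It is not part of the commit or of the Clean backend. It assumes the Thumb-2 rule that a 16-bit `push` can only name `r0` to `r7` plus the link register, and a 16-bit `pop` only `r0` to `r7` plus the program counter; any other register list needs the wide 32-bit encoding. The names `push_size`, `pop_size`, `ffi_overhead` and `PROPOSED` are hypothetical and chosen for illustration only. The register assignment in `PROPOSED` is the one quoted in the patch.

```python
# Illustrative sketch only: estimates the code-size cost of saving registers
# around a foreign function call in Thumb-2. All names here are hypothetical.

# Registers that the 16-bit (narrow) push/pop encodings can name.
LOW_REGISTERS = {f"r{i}" for i in range(8)}


def push_size(regs):
    """Bytes needed for a push of `regs`: narrow allows r0-r7 and r14 (lr)."""
    return 2 if set(regs) <= LOW_REGISTERS | {"r14"} else 4


def pop_size(regs):
    """Bytes needed for a pop of `regs`: narrow allows r0-r7 and r15 (pc)."""
    return 2 if set(regs) <= LOW_REGISTERS | {"r15"} else 4


def ffi_overhead(live_caller_saved):
    """Extra bytes per foreign call: one push before it, one pop after it."""
    regs = set(live_caller_saved)
    return push_size(regs) + pop_size(regs)


# Caller-saved registers that stay live across a foreign call under the
# proposed allocation, as listed in the patch text.
PROPOSED = {"heap pointer": "r1", "A2": "r2", "A3": "r12"}

if __name__ == "__main__":
    print(ffi_overhead(PROPOSED.values()))  # r12 forces both wide encodings
    print(ffi_overhead(["r1", "r2"]))       # low registers only, for comparison
```

Under these assumptions the presence of `r12` in the list is what forces the wide encodings; a list containing only low registers would cost half as much.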