1 files changed, 205 insertions, 0 deletions
diff --git a/doc/docs/backend/rts.md b/doc/docs/backend/rts.md
new file mode 100644
index 0000000..d401f3a
--- /dev/null
+++ b/doc/docs/backend/rts.md
@@ -0,0 +1,205 @@
+---
+vim: noexpandtab tabstop=2 shiftwidth=2
+---
+
+# The Snug run time system
+
+The Snug run time system is based on graph reduction. It is based on two
+sources of inspiration:
+
+- The run time system of [Clean][], which I got to know in depth through my
+	work on the [ABC interpreter][]. I am grateful for many discussions with John
+	van Groningen and Rinus Plasmeijer during my work on Clean.
+- The 1987 book *The implementation of functional programming languages* by
+	Simon L. Peyton Jones (henceforth IFPL).
+
+At times, unorthodox choices have been made to reduce memory usage as much as
+possible, sacrificing speed when necessary.
+
+## Node layout
+
+There are two types of nodes: head normal forms (hnfs) and thunks.
+
+Both types of nodes are of variable width. Thunks have at least 2 words so that
+they can always be overwritten with an indirection. Hnfs have no minimum size.
+
+The first word of a hnf node points to a [descriptor](#hnf-descriptors)
+containing information such as arity and constructor name (for printing). The
+first word of a thunk node points to the code that can evaluate the thunk.
+Information about the arity can be found above this code (see
+[thunk descriptors](#thunk-descriptors)). Thus the nodes look like this:
+
+| Descriptor | Argument 1    | Argument 2    |
+|------------|---------------|---------------|
+| CONS       | (ptr to head) | (ptr to tail) |
+
+| Descriptor | Argument 1   | Argument 2   |
+|------------|--------------|--------------|
+| ++         | (ptr to arg) | (ptr to arg) |
+
+The two types of nodes can be distinguished because the descriptor of hnf nodes
+is in the data section, while the code to evaluate a thunk is in the text
+section. To quickly check whether a node is a hnf or a thunk, the address of
+the boundary of the data and text section is held in a
+[register](registers.md).
+
+The arguments of some built-in constructors can be unboxed:
+
+| Descriptor | Argument 1 |
+|------------|------------|
+| INT        | 37         |
+
+## Descriptors
+
+### Hnf descriptors
+Descriptors of hnf nodes are located in the data section. They contain
+information about the arity of the node. It also contains the number of
+arguments to be curried in, which is 0 for all data constructors but non-zero
+for the [descriptors of closures](#closure-descriptors). (This means that, in
+principle, data constructor descriptors only need a halfword.)
+
+```mipsasm
+	.align	1
+__main_cTuple:
+	.byte	2	# pointer arity
+	.byte	0	# basic value arity
+	.byte	0	# number of arguments to be curried in
+```
+
+### Thunk descriptors
+Thunk nodes point to the code that can evaluate them. Above this code a
+descriptor is used storing information about arity and strictness:
+
+```mipsasm
+	.align	1
+	.byte	0x1	# strict in 1st argument
+	.byte	2	# arity
+__main_nappend:
+	# ...
+```
+
+!!! warning "Implementation may change"
+	Strictness will probably be removed in the future, as evaluating arguments
+	will be moved out of the driver into generated code (see
+	[evaluation](#evaluation)).
+
+	This also makes the implementation of `ap` simpler, as it does not need to
+	check strictness of the function it evaluates any more.
+
+### Closure descriptors
+For each thunk with a non-zero number of arguments there is also a closure
+descriptor table in the data section. This is a table of hnf descriptors used
+for closures (unsaturated function applications) of the thunk. It looks like
+this:
+
+```mipsasm
+	.align	2
+	# entry for closure with 0 arguments:
+	.byte	0	# pointer arity
+	.byte	0	# basic value arity
+	.byte	2	# number of arguments that are still to be curried in
+	.byte	0	# reserved
+	# entry for closure with 1 argument:
+	.byte	1	# pointer arity
+	.byte	0	# basic value arity
+	.byte	1	# number of arguments that are still to be curried in
+	.byte	0	# reserved
+	# pointer to corresponding code entry:
+__main_uappend:
+	.word	__main_nappend
+```
+
+The node of a closure that still needs *n* arguments is given the descriptor
+`__MODULE_uFUNCTION`-4\**n*.
+
+Data descriptors also have a closure descriptor table. The final word points to
+a thunk entry that simply builds a hnf of the corresponding saturated data
+descriptor.
+
+Arguments can be curried into a closure using a special `ap` thunk. It checks
+the number of arguments that are still to be curried in. If this number is 1,
+it evaluates the function; if not, it creates a new closure:
+
+```mipsasm
+	.align	1
+	.byte	0x1	# strict in the first argument
+	.byte	2	# pointer arity
+_nap:
+	# custom implementation in assembly
+```
+
+!!! note "Optimization"
+	This can be optimized for space by adding `ap2`, `ap3`, etc., for currying in
+	multiple arguments at once. That way we can, for example, use a 5-word `ap3`
+	node for currying in three arguments, instead of using a chain of three `ap`
+	nodes (9 words in total). However, since this is presumably not very
+	frequent, it is not high priority, especially not for large numbers of
+	arguments.
+
+## Evaluation
+
+Evaluation is done using pointer reversal so that the stack can be avoided. The
+technique is described in IFPL §11.6.1. The idea is as follows.
+
+We have a *back pointer* which initially points to a special ROOT node. We have
+a *front pointer* which initially points to the root of the graph to be
+evaluated. During evaluation, the front pointer will always point to the node
+under evaluation, and the back pointer will always point to its parent node.
+
+To evaluate a node, its strict arguments must be evaluated. To do this, the
+following steps are executed simultaneously:
+
+- The pointer to the child node is updated with the value of the back pointer
+	and marked by setting the least significant bit.
+- The back pointer is updated to the front pointer.
+- The front pointer is updated to the child node.
+- Jump to the evaluation entry of the child node.
+
+The old back pointer can be recovered by finding the marked cell in the parent
+node, which is now pointed to by the back pointer. Therefore, after evaluation
+of the child node, the following steps are performed simultaneously:
+
+- The cell pointing to the child node is recovered by checking which argument
+	of the parent node (pointed to by the back pointer) is marked.
+- The cell pointing to the child node is updated with the new address of the
+	child node (it may have changed if the old node could not be overwritten
+	because it was smaller, or in case garbage collection ran).
+- The back pointer is set to the old back pointer (the value in the cell
+	pointing to the child node).
+- The front pointer is set to the back pointer.
+- Jump to the evaluation entry of the parent node.
+
+The parent node will then check again if the child that has just been evaluated
+needs to be evaluated. This isn't terribly efficient, but we assume there to be
+few strict arguments.
+
+!!! warning "Implementation may change"
+	At the moment, thunk descriptors contain information about strictness. This
+	way, the central `eval` driver can check if arguments of a thunk node need to
+	be evaluated before evaluating the thunk itself.
+
+	It is more efficient, and does not cost a lot in terms of program size, if
+	thunks get a separate entry point that first evaluates its strict arguments.
+	The descriptor then does not need to include strictness information, and the
+	central `eval` driver does not need to use this to dynamically evaluate
+	children.
+
+## Garbage collection
+
+Not implemented yet: a variant of Jonkers' algorithm, using the most
+significant bit for pointer reversal. The most significant bit is always set on
+the PIC32 devices we target.
+
+## Reserved bits
+
+The design leaves some space in addresses for special use of the run time
+system:
+
+- The least significant bit is always zero and can be used for pointer reversal
+	during [evaluation](#evaluation).
+- The most significant bit is always one (check the data sheet for the memory
+	map), so it can be used for pointer reversal during [garbage
+	collection](#garbage-collection).
+
+[ABC interpreter]: https://clean-lang.org/pkg/abc-interpreter
+[Clean]: https://clean-lang.org