--- vim: noexpandtab tabstop=2 shiftwidth=2 --- # The Snug run time system The Snug run time system is based on graph reduction. It is based on two sources of inspiration: - The run time system of [Clean][], which I got to know in depth through my work on the [ABC interpreter][]. I am grateful for many discussions with John van Groningen and Rinus Plasmeijer during my work on Clean. - The 1987 book *The implementation of functional programming languages* by Simon L. Peyton Jones (henceforth IFPL). At times, unorthodox choices have been made to reduce memory usage as much as possible, sacrificing speed when necessary. ## Node layout There are two types of nodes: head normal forms (hnfs) and thunks. Both types of nodes are of variable width. Thunks have at least 2 words so that they can always be overwritten with an indirection. Hnfs have no minimum size. The first word of a hnf node points to a [descriptor](#hnf-descriptors) containing information such as arity and constructor name (for printing). The first word of a thunk node points to the code that can evaluate the thunk. Information about the arity can be found above this code (see [thunk descriptors](#thunk-descriptors)). Thus the nodes look like this: | Descriptor | Argument 1 | Argument 2 | |------------|---------------|---------------| | CONS | (ptr to head) | (ptr to tail) | | Descriptor | Argument 1 | Argument 2 | |------------|--------------|--------------| | ++ | (ptr to arg) | (ptr to arg) | The two types of nodes can be distinguished because the descriptor of hnf nodes is in the data section, while the code to evaluate a thunk is in the text section. To quickly check whether a node is a hnf or a thunk, the address of the boundary of the data and text section is held in a [register](registers.md). The arguments of some built-in constructors can be unboxed: | Descriptor | Argument 1 | |------------|------------| | INT | 37 | ## Descriptors ### Hnf descriptors Descriptors of hnf nodes are located in the data section. They contain information about the arity of the node. ```mipsasm .align 1 __main_cTuple: .byte 2 # pointer arity .byte 0 # basic value arity ``` NB: [descriptors of closures](#closure-descriptors) also contain the number of arguments to be curried in (minus 1). ### Thunk descriptors Thunk nodes point to the code that can evaluate them. Above this code a descriptor is used storing information about arity and strictness: ```mipsasm .align 1 .byte 0x1 # strict in 1st argument .byte 2 # arity __main_nappend: # ... ``` !!! warning "Implementation will change" Strictness will be removed in the future, as evaluating arguments will be moved out of the driver into generated code (see [evaluation](#evaluation)). This makes the implementation of `ap` simpler, as it does not need to check strictness of the function it evaluates any more. Currently `ap` assumes all arguments are strict. ### Closure descriptors For each thunk with a non-zero number of arguments there is also a closure descriptor table in the data section. This is a table of hnf descriptors used for closures (unsaturated function applications) of the thunk. It looks like this: ```mipsasm .align 2 # entry for closure with 0 arguments: .byte 0 # pointer arity .byte 0 # basic value arity .byte 1 # number of arguments that are still to be curried in minus 1 .byte 0 # reserved # entry for closure with 1 argument: .byte 1 # pointer arity .byte 0 # basic value arity .byte 0 # number of arguments that are still to be curried in minus 1 .byte 0 # reserved # pointer to corresponding code entry: __main_uappend: .word __main_nappend ``` The node of a closure that still needs *n* arguments is given the descriptor `__MODULE_uFUNCTION`-4\**n*. Data descriptors also have a closure descriptor table. The final word points to a thunk entry that simply builds a hnf of the corresponding saturated data descriptor. Arguments can be curried into a closure using a special `ap` thunk. It checks the number of arguments that are still to be curried in. If this number is 1, it evaluates the function; if not, it creates a new closure: ```mipsasm .align 1 .byte 0x1 # strict in the first argument .byte 2 # pointer arity _nap: # custom implementation in assembly ``` !!! note "Optimization" This can be optimized for space by adding `ap2`, `ap3`, etc., for currying in multiple arguments at once. That way we can, for example, use a 5-word `ap3` node for currying in three arguments, instead of using a chain of three `ap` nodes (9 words in total). However, since this is presumably not very frequent, it is not high priority, especially not for large numbers of arguments. ## Evaluation Evaluation is done using pointer reversal so that the stack can be avoided. The technique is described in IFPL ยง11.6.1. The idea is as follows. We have a *back pointer* which initially points to a special ROOT node. We have a *front pointer* which initially points to the root of the graph to be evaluated. During evaluation, the front pointer will always point to the node under evaluation, and the back pointer will always point to its parent node. To evaluate a node, its strict arguments must be evaluated. To do this, the following steps are executed simultaneously: - The pointer to the child node is updated with the value of the back pointer and marked by setting the least significant bit. - The back pointer is updated to the front pointer. - The front pointer is updated to the child node. - Jump to the evaluation entry of the child node. The old back pointer can be recovered by finding the marked cell in the parent node, which is now pointed to by the back pointer. Therefore, after evaluation of the child node, the following steps are performed simultaneously: - The cell pointing to the child node is recovered by checking which argument of the parent node (pointed to by the back pointer) is marked. - The cell pointing to the child node is updated with the new address of the child node (it may have changed if the old node could not be overwritten because it was smaller, or in case garbage collection ran). - The back pointer is set to the old back pointer (the value in the cell pointing to the child node). - The front pointer is set to the back pointer. - Jump to the evaluation entry of the parent node. The parent node will then check again if the child that has just been evaluated needs to be evaluated. This isn't terribly efficient, but we assume there to be few strict arguments. !!! warning "Implementation may change" At the moment, thunk descriptors contain information about strictness. This way, the central `eval` driver can check if arguments of a thunk node need to be evaluated before evaluating the thunk itself. It is more efficient, and does not cost a lot in terms of program size, if thunks get a separate entry point that first evaluates its strict arguments. The descriptor then does not need to include strictness information, and the central `eval` driver does not need to use this to dynamically evaluate children. ## Garbage collection Not implemented yet: a variant of Jonkers' algorithm, using the most significant bit for pointer reversal. The most significant bit is always set on the PIC32 devices we target. ## Reserved bits The design leaves some space in addresses for special use of the run time system: - The least significant bit is always zero and can be used for pointer reversal during [evaluation](#evaluation). - The most significant bit is always one (check the data sheet for the memory map), so it can be used for pointer reversal during [garbage collection](#garbage-collection). [ABC interpreter]: https://clean-lang.org/pkg/abc-interpreter [Clean]: https://clean-lang.org