1 files changed, 92 insertions, 0 deletions
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..cb3b9e4
--- /dev/null
+++ b/README.md
@@ -0,0 +1,92 @@
+# Tests for an LLVM-based minimum effort code generator for ABC
+
+This directory contains some preliminary tests to see if a code generator for
+ABC could be built with minimum effort using LLVM.
+
+The general idea is due to Erin van der Veen, and his repository is located at
+https://gitlab.com/top-software/llvm-code-generator.
+
+## General idea
+
+Non-control-flow ABC instructions are implemented in LLVM IR, in this directory
+in `rts.ll`. For example:
+
+```llvm
+attributes #0 = { alwaysinline }
+
+define private i64* @addI(i64* %bsp.0) #0 {
+  %t.0 = load i64, i64* %bsp.0
+  store i64 undef, i64* %bsp.0
+  %bsp.1 = getelementptr i64, i64* %bsp.0, i64 1
+  %t.1 = load i64, i64* %bsp.1
+  %t.2 = add i64 %t.0, %t.1
+  store i64 %t.2, i64* %bsp.1
+  ret i64* %bsp.1
+}
+```
+
+ABC code can then be compiled without much effort; we only need to cleverly
+break the ABC code up into subroutines and handle the control flow instructions
+correctly. For example:
+
+```llvm
+  %bsp.307 = call i64* @pushI(i64 %r.1, i64* %bsp.306.0)
+  %bsp.308 = call i64* @addI(i64* %bsp.307)
+  %bsp.309 = call i64* @pushI(i64 1, i64* %bsp.308)
+  %bsp.310 = call i64* @push_b(i64 1, i64* %bsp.309)
+  ; ...
+```
+
+Because the definitions of the instructions (e.g. `addI`) are `alwaysinline`,
+there is no overhead for these calls, and we can rely on LLVM to optimize
+subroutine bodies after the `alwaysinline` pass.
+
+This means we need a custom LLVM pipeline, implemented in the Makefile as:
+
+```sh
+	| opt-11 -S -always-inline \
+	| opt-11 -S -O3 \
+	| sed 's/noinline nounwind optnone/nounwind/' \
+	| opt-11 -S -O3 \
+```
+
+## Stacks
+
+B-stack arguments are passed as subroutine arguments and return values; there
+is no global B-stack. Subroutines do have a global A-stack:
+
+```llvm
+define private i64 @s1(i64** %globasp, i64 %arg) #0 {
+  %astack = alloca i64*, i64 10000
+  %asp.000 = getelementptr i64*, i64** %astack
+  %bstack = alloca i64, i64 10000
+  %bsp.000 = getelementptr i64, i64* %bstack, i64 9999
+
+  %bsp.001 = call i64* @pushI(i64 %arg, i64* %bsp.000)
+  ; ...
+```
+
+This ensures that the B-stack is combined with the C-stack, and the `alloca`s
+for subroutine-local stacks are optimized away by LLVM.
+
+## Control flow
+
+Example of a `jmp_false else.1` instruction, using `peek_b` and `pop_b1` from
+the RTS:
+
+```llvm
+  %t.0 = call i64 @peek_b(i64* %bsp.102)
+  %bsp.103 = call i64* @pop_b1(i64* %bsp.102)
+  %t.1 = trunc i64 %t.0 to i1
+  br i1 %t.1, label %l.0, label %else.1
+```
+
+Example of a `jsr s1` instruction with one B-stack argument and one B-stack
+return value:
+
+```llvm
+  %arg.0 = call i64 @peek_b(i64* %bsp.302)
+  %bsp.302.0 = call i64* @pop_b1(i64* %bsp.302)
+  %r.0 = call i64 @s1(i64** %globasp, i64 %arg.0)
+  %bsp.303 = call i64* @pushI(i64 %r.0, i64* %bsp.302.0)
+```