# Tests for an LLVM-based minimum-effort code generator for ABC
This directory contains preliminary tests to find out whether a code generator
for ABC can be built with minimum effort using LLVM.
The general idea is due to Erin van der Veen, and his repository is located at
https://gitlab.com/top-software/llvm-code-generator.
## General idea
ABC instructions that do not affect control flow are implemented in LLVM IR, in
`rts.ll` in this directory. For example:
```llvm
attributes #0 = { alwaysinline }
define private i64* @addI(i64* %bsp.0) #0 {
  %t.0 = load i64, i64* %bsp.0
  store i64 undef, i64* %bsp.0
  %bsp.1 = getelementptr i64, i64* %bsp.0, i64 1
  %t.1 = load i64, i64* %bsp.1
  %t.2 = add i64 %t.0, %t.1
  store i64 %t.2, i64* %bsp.1
  ret i64* %bsp.1
}
```
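The generated code shown in this README also calls `pushI`, `peek_b` and `pop_b1`, whose definitions are not listed here. The following is a minimal sketch of how they might look, assuming they follow the same convention as `addI` (the stack pointer points at the top element, and the stack grows towards lower addresses). The bodies are inferred from the call sites and are not copied from `rts.ll`:

```llvm
attributes #0 = { alwaysinline }

; Assumed definition: push %val by moving the stack pointer one slot down
; and storing the value in the new top slot.
define private i64* @pushI(i64 %val, i64* %bsp.0) #0 {
  %bsp.1 = getelementptr i64, i64* %bsp.0, i64 -1
  store i64 %val, i64* %bsp.1
  ret i64* %bsp.1
}

; Assumed definition: read the top element without moving the stack pointer.
define private i64 @peek_b(i64* %bsp.0) #0 {
  %t.0 = load i64, i64* %bsp.0
  ret i64 %t.0
}

; Assumed definition: drop the top element. As in addI, the freed slot is
; overwritten with undef before the pointer moves one slot up.
define private i64* @pop_b1(i64* %bsp.0) #0 {
  store i64 undef, i64* %bsp.0
  %bsp.1 = getelementptr i64, i64* %bsp.0, i64 1
  ret i64* %bsp.1
}
```

Splitting a pop into `peek_b` and `pop_b1` keeps every instruction a function with a single return value, which matches how the generated code threads the `%bsp.N` values through the calls.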
ABC code can then be compiled without much effort; we only need to split the
ABC code into subroutines and translate the control flow instructions
correctly. For example:
```llvm
%bsp.307 = call i64* @pushI(i64 %r.1, i64* %bsp.306.0)
%bsp.308 = call i64* @addI(i64* %bsp.307)
%bsp.309 = call i64* @pushI(i64 1, i64* %bsp.308)
%bsp.310 = call i64* @push_b(i64 1, i64* %bsp.309)
; ...
```
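Because the definitions of the instructions (e.g. `addI`) are `alwaysinline`,
these calls disappear after inlining and carry no run-time overhead, and we can
rely on LLVM to optimize subroutine bodies after the `-always-inline` pass.
This requires a custom LLVM pipeline that first inlines the instruction
definitions, then optimizes, strips the `noinline` and `optnone` attributes
with `sed`, and optimizes again. It is implemented in the Makefile as: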
```sh
| opt-11 -S -always-inline \
| opt-11 -S -O3 \
| sed 's/noinline nounwind optnone/nounwind/' \
| opt-11 -S -O3 \
```
## Stacks
B-stack arguments and results are passed as subroutine arguments and return
values; there is no global B-stack. Subroutines do have access to a global
A-stack (the `%globasp` parameter in the example):
```llvm
define private i64 @s1(i64** %globasp, i64 %arg) #0 {
  %astack = alloca i64*, i64 10000
  %asp.000 = getelementptr i64*, i64** %astack
  %bstack = alloca i64, i64 10000
  %bsp.000 = getelementptr i64, i64* %bstack, i64 9999
  %bsp.001 = call i64* @pushI(i64 %arg, i64* %bsp.000)
  ; ...
```
Because the subroutine-local stacks are `alloca`s, the B-stack is combined with
the C-stack, and the `alloca`s are optimized away by LLVM.
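To make the local-stack idea concrete, here is a small self-contained module that can be fed to the tools directly. It is a sketch under assumptions: `@inc` and `@main` are hypothetical, the stack size of 16 slots is arbitrary, the A-stack is left out, and the bodies of `pushI` and `peek_b` are guesses that follow the convention of `addI` above.

```llvm
attributes #0 = { alwaysinline }

; Assumed definition: push by moving the pointer one slot down.
define private i64* @pushI(i64 %val, i64* %bsp.0) #0 {
  %bsp.1 = getelementptr i64, i64* %bsp.0, i64 -1
  store i64 %val, i64* %bsp.1
  ret i64* %bsp.1
}

; addI as shown earlier in this README.
define private i64* @addI(i64* %bsp.0) #0 {
  %t.0 = load i64, i64* %bsp.0
  store i64 undef, i64* %bsp.0
  %bsp.1 = getelementptr i64, i64* %bsp.0, i64 1
  %t.1 = load i64, i64* %bsp.1
  %t.2 = add i64 %t.0, %t.1
  store i64 %t.2, i64* %bsp.1
  ret i64* %bsp.1
}

; Assumed definition: read the top element.
define private i64 @peek_b(i64* %bsp.0) #0 {
  %t.0 = load i64, i64* %bsp.0
  ret i64 %t.0
}

; Hypothetical subroutine: computes %arg + 1 on a subroutine-local B-stack.
define i64 @inc(i64 %arg) {
  %bstack = alloca i64, i64 16
  ; Start at the last slot; the first push lands one slot below it.
  %bsp.000 = getelementptr i64, i64* %bstack, i64 15
  %bsp.001 = call i64* @pushI(i64 %arg, i64* %bsp.000)
  %bsp.002 = call i64* @pushI(i64 1, i64* %bsp.001)
  %bsp.003 = call i64* @addI(i64* %bsp.002)
  %r.0 = call i64 @peek_b(i64* %bsp.003)
  ret i64 %r.0
}

; Hypothetical driver: returns inc(41) as the process exit status.
define i32 @main() {
  %r = call i64 @inc(i64 41)
  %code = trunc i64 %r to i32
  ret i32 %code
}
```

Assuming an `lli-11` binary is installed next to `opt-11`, running it on this file should exit with status 42. Running the file through the `-always-inline` and `-O3` steps of the pipeline above is a way to check whether the `alloca` in `@inc` is indeed removed.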
## Control flow
A `jmp_false else.1` instruction is translated as follows, using `peek_b` and
`pop_b1` from the RTS; `%l.0` is the label of the fall-through block:
```llvm
%t.0 = call i64 @peek_b(i64* %bsp.102)
%bsp.103 = call i64* @pop_b1(i64* %bsp.102)
%t.1 = trunc i64 %t.0 to i1
br i1 %t.1, label %l.0, label %else.1
```
A `jsr s1` instruction with one B-stack argument and one B-stack return value
is translated as follows:
```llvm
%arg.0 = call i64 @peek_b(i64* %bsp.302)
%bsp.302.0 = call i64* @pop_b1(i64* %bsp.302)
%r.0 = call i64 @s1(i64** %globasp, i64 %arg.0)
%bsp.303 = call i64* @pushI(i64 %r.0, i64* %bsp.302.0)
```
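The two fragments above can be combined into one complete function. The sketch below is illustrative only: `@caller`, the body of `@s1`, and the helper definitions are assumptions made for this example, and the stack size of 16 slots is arbitrary. It performs a `jmp_false else.1` on a flag and, on the fall-through path, a `jsr s1` on the remaining B-stack element.

```llvm
attributes #0 = { alwaysinline }

; Assumed helper definitions, following the convention of addI.
define private i64* @pushI(i64 %val, i64* %bsp.0) #0 {
  %bsp.1 = getelementptr i64, i64* %bsp.0, i64 -1
  store i64 %val, i64* %bsp.1
  ret i64* %bsp.1
}

define private i64 @peek_b(i64* %bsp.0) #0 {
  %t.0 = load i64, i64* %bsp.0
  ret i64 %t.0
}

define private i64* @pop_b1(i64* %bsp.0) #0 {
  store i64 undef, i64* %bsp.0
  %bsp.1 = getelementptr i64, i64* %bsp.0, i64 1
  ret i64* %bsp.1
}

; Hypothetical callee: one B-stack argument in, one B-stack result out.
; The global A-stack pointer is accepted but not used here.
define private i64 @s1(i64** %globasp, i64 %arg) {
  %r = add i64 %arg, 1
  ret i64 %r
}

; Hypothetical caller: returns s1(%x) if the low bit of %flag is set,
; and %x otherwise.
define i64 @caller(i64** %globasp, i64 %flag, i64 %x) {
  %bstack = alloca i64, i64 16
  %bsp.000 = getelementptr i64, i64* %bstack, i64 15
  %bsp.001 = call i64* @pushI(i64 %x, i64* %bsp.000)
  %bsp.002 = call i64* @pushI(i64 %flag, i64* %bsp.001)
  ; jmp_false else.1: pop the flag and branch on it
  %t.0 = call i64 @peek_b(i64* %bsp.002)
  %bsp.003 = call i64* @pop_b1(i64* %bsp.002)
  %t.1 = trunc i64 %t.0 to i1
  br i1 %t.1, label %l.0, label %else.1

l.0:
  ; jsr s1: pop the argument, call, push the result
  %arg.0 = call i64 @peek_b(i64* %bsp.003)
  %bsp.003.0 = call i64* @pop_b1(i64* %bsp.003)
  %r.0 = call i64 @s1(i64** %globasp, i64 %arg.0)
  %bsp.004 = call i64* @pushI(i64 %r.0, i64* %bsp.003.0)
  %res.0 = call i64 @peek_b(i64* %bsp.004)
  ret i64 %res.0

else.1:
  ; the flag was false: %x is still on top of the B-stack
  %res.1 = call i64 @peek_b(i64* %bsp.003)
  ret i64 %res.1
}
```

Because each block continues from the stack pointer value that was live at the branch (`%bsp.003`), no phi nodes are needed in this small case; code where two paths rejoin with different stack pointers would need them.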