diff --git a/resources/md/2021-06-23-compiling-clean-in-the-browser-with-webassembly-part-3-putting-it-all-together.md b/resources/md/2021-06-23-compiling-clean-in-the-browser-with-webassembly-part-3-putting-it-all-together.md
new file mode 100644
index 0000000..0244507
--- /dev/null
+++ b/resources/md/2021-06-23-compiling-clean-in-the-browser-with-webassembly-part-3-putting-it-all-together.md
@@ -0,0 +1,193 @@
+*This is part 3 in a series on running the [Clean][] compiler in
+[WebAssembly][], with the proof of concept in the [Clean Sandbox][]. In this
+part I discuss how all the individual components are integrated. See the
+[introduction][] for a high-level overview. In the previous part, I discussed
+[the compilation pipeline][pipeline].*
+
+[[toc]]
+
+# Overview
+
+We have to put all the different elements together in JavaScript. In
+particular, we have the following elements:
+
+- The C tools (compiler backend, bytecode generator, bytecode linker, and
+ bytecode prelinker)
+- The WebAssembly interpreters in which the Clean tools run (compiler frontend,
+ `make`-like program)
+- The WebAssembly interpreter for the compiled program itself.
+
+# Linking the compiler frontend and backend
+
+We have already seen how [the `make` tool communicates with the C tools and
+with the compiler][pipeline]. It is a little trickier to set up the
+communication between the compiler frontend and the backend. This is because
+the `make` tool was written for this application specifically, while the
+compiler wasn't.
+
+In Clean, the foreign function interface works through a special ABC
+instruction, `ccall`. For example, if we have a C function `int c_add(int,
+int)`, we can define `add` below. `II:I` here stands for the type: two integer
+arguments and an integer return type.
+
+```clean
+add :: !Int !Int -> Int
+add _ _ = code {
+ ccall c_add "II:I"
+}
+```
+
+This instruction is not supported by the interpreter, because there is no way
+to implement it generically in that context. When interpretation reaches a
+`ccall` (or any other unimplemented instruction), the WebAssembly interpreter
+calls a JavaScript function that can try to handle it. This function can read
+and modify the state of the ABC machine. It returns the address at which
+interpretation should continue, or 0 if it cannot handle the instruction. This
+allows us to link the compiler frontend to the backend in JavaScript:
+
+```js
+var clean_compiler;
+function create_clean_compiler () {
+  return ABCInterpreter.instantiate({
+    interpreter_imports: {
+      handle_illegal_instr: (pc, instr, asp, bsp, csp, hp, hp_free) => {
+        instr = ABCInterpreter.instructions[instr];
+        if (instr == 'ccall')
+          return i_ccall.bind(c_compiler)(clean_compiler, pc, asp, bsp);
+        else
+          return 0;
+      },
+      /* other options .. */
+    }
+  }).then(abc => {
+    clean_compiler = abc;
+  });
+}
+```
+
+With `i_ccall` we can link a WebAssembly interpreter (`abc`) to an Emscripten
+module (bound as `this`), which contains the global C functions:
+
+```js
+function i_ccall (abc, pc, asp, bsp) {
+  const fun = '_' + abc.get_clean_string(abc.memory_array[pc/4+2]-8); /* e.g. c_add; emscripten adds an underscore */
+  const type = abc.get_clean_string(abc.memory_array[pc/4+4]-8); /* e.g. II:I */
+
+  if (!(fun in this))
+    throw 'ccall: unknown function '+fun;
+
+  /* parse type; get arguments from the Clean heap and stack .. */
+
+  const result = this[fun].apply(null, args);
+
+  /* copy the result to the WebAssembly interpreter .. */
+
+  return pc+24; /* return the address of the next instruction */
+}
+```
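+
+The elided "parse type" step and the lookup on `this` can be sketched in
+isolation. This is a minimal sketch, not the sandbox's actual code:
+`parse_ccall_type`, `fake_module`, and `call_c_function` are hypothetical
+names, only the `I` type character from the example above is handled, and the
+arguments are passed in directly instead of being read from the ABC machine.
+
+```javascript
+// Hypothetical helper: split a ccall type string such as "II:I" into
+// argument and result type characters. Only 'I' (integer) is handled here,
+// because it is the only type character used in this post.
+function parse_ccall_type (type) {
+  const [args, results] = type.split(':');
+  if (results === undefined)
+    throw 'ccall: malformed type ' + type;
+  for (const c of args + results)
+    if (c != 'I')
+      throw 'ccall: unsupported type character ' + c;
+  return {args: [...args], results: [...results]};
+}
+
+// Stand-in for an Emscripten module; the C function c_add is looked up
+// under the name _c_add.
+const fake_module = {
+  _c_add: (x, y) => x + y
+};
+
+// Simplified variant of i_ccall that takes its arguments directly instead of
+// reading them from the ABC machine's stacks.
+function call_c_function (name, type, args) {
+  const fun = '_' + name;
+  if (!(fun in this))
+    throw 'ccall: unknown function ' + fun;
+  const parsed = parse_ccall_type(type);
+  if (parsed.args.length != args.length)
+    throw 'ccall: expected ' + parsed.args.length + ' arguments';
+  return this[fun].apply(null, args);
+}
+
+const sum = call_c_function.bind(fake_module)('c_add', 'II:I', [1, 2]);
+```
+
+Binding the module as `this`, as `handle_illegal_instr` does, keeps the handler
+independent of any particular Emscripten module.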
+
+# Clean file I/O
+
+We have a similar problem with Clean programs doing file I/O. The compiler
+frontend uses file I/O, but for this it needs ABC instructions that are not
+implemented in the WebAssembly interpreter. We extend `handle_illegal_instr` to
+also catch these instructions, and use Emscripten's `FS` library to implement
+them. For example, `writeFS` writes a string to a file. `handle_illegal_instr`
+becomes:
+
+```js
+handle_illegal_instr: (pc, instr, asp, bsp, csp, hp, hp_free) => {
+  instr = ABCInterpreter.instructions[instr];
+  if (instr == 'ccall')
+    return i_ccall.bind(c_compiler)(clean_compiler, pc, asp, bsp);
+  else if (instr == 'writeFS')
+    return i_writeFS.bind(c_compiler)(clean_compiler, pc, asp, bsp);
+  else
+    return 0;
+},
+```
+
+And again `i_writeFS` is defined to link an arbitrary WebAssembly interpreter
+and Emscripten module together:
+
+```js
+function i_writeFS (abc, pc, asp, bsp) {
+  const i = abc.memory_array[bsp/4+2];
+
+  /* get arguments from the stack */
+  const s_ptr = abc.memory_array[asp/4];
+  const size = abc.memory_array[s_ptr/4+2];
+  const s = new Uint8Array(abc.memory_array.buffer, s_ptr+16, size);
+
+  this.FS.write(abc.files[i].stream, s, 0, size);
+
+  abc.interpreter.instance.exports.set_asp(asp-8); /* pop the string from the stack */
+  return pc+8; /* return the address of the next instruction */
+}
+```
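+
+The string access in `i_writeFS` can be tried out on a small fake heap. The
+sketch below assumes that `memory_array` is a `Uint32Array` over the
+interpreter's memory (which would explain the division by 4 in the indices)
+and only mirrors the two offsets used above: the length at `s_ptr+8` and the
+characters from `s_ptr+16`. The helper `string_bytes` is a hypothetical name,
+and what the first eight bytes of the node contain is not covered here.
+
+```javascript
+// Build a small fake heap. Assumption: memory_array is a Uint32Array over the
+// interpreter's memory, which is why it is indexed with byte_offset/4.
+const buffer = new ArrayBuffer(64);
+const memory_array = new Uint32Array(buffer);
+
+// Place a string node at byte offset 8, using the layout that i_writeFS
+// reads: the length at s_ptr+8 and the characters from s_ptr+16 onwards.
+const s_ptr = 8;
+const text = 'Hello';
+memory_array[s_ptr/4+2] = text.length;
+new Uint8Array(buffer, s_ptr+16, text.length).set(new TextEncoder().encode(text));
+
+// Hypothetical helper: return a view on the bytes of the string at s_ptr.
+function string_bytes (memory_array, s_ptr) {
+  const size = memory_array[s_ptr/4+2];
+  return new Uint8Array(memory_array.buffer, s_ptr+16, size);
+}
+
+const decoded = new TextDecoder().decode(string_bytes(memory_array, s_ptr));
+```
+
+Because the `Uint8Array` is a view and not a copy, the bytes can be handed to
+a write call without an intermediate buffer.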
+
+In reality we need about 10 instructions for file I/O, but the others are
+similar.
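+
+With about 10 handlers, the `if`/`else` chain could also be written as a
+lookup table. The following is only a sketch of that alternative, not how the
+sandbox is written: `handlers` and `make_handle_illegal_instr` are
+hypothetical names, the two handlers are stubs, and the instruction list
+passed at the end is made up for the example.
+
+```javascript
+// Hypothetical stand-ins for the real handlers, which would read the ABC
+// machine's state and return the address of the next instruction.
+function i_ccall (abc, pc, asp, bsp) { return pc+24; }
+function i_writeFS (abc, pc, asp, bsp) { return pc+8; }
+
+// One entry per instruction that is handled in JavaScript.
+const handlers = {
+  ccall: i_ccall,
+  writeFS: i_writeFS
+};
+
+// Build a handle_illegal_instr function for one interpreter/module pair.
+// instructions maps opcode numbers to names, like ABCInterpreter.instructions.
+// get_abc is a function because the interpreter instance only becomes
+// available after instantiation has finished.
+function make_handle_illegal_instr (instructions, get_abc, c_module) {
+  return (pc, instr, asp, bsp, csp, hp, hp_free) => {
+    const name = instructions[instr];
+    if (!Object.prototype.hasOwnProperty.call(handlers, name))
+      return 0; /* not handled, like the final else branch */
+    return handlers[name].call(c_module, get_abc(), pc, asp, bsp);
+  };
+}
+
+const handle = make_handle_illegal_instr(['ccall', 'writeFS', 'halt'], () => null, {});
+```
+
+Adding another file I/O instruction then only requires a new entry in
+`handlers`.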
+
+# Running the generated bytecode
+
+We can now run a Clean program from our editor through the [compilation
+pipeline][pipeline] and generate a prelinked bytecode file. This is similar to
+a native executable, but for the WebAssembly interpreter. What remains is
+actually running the bytecode.
+
+We do this by creating a new instance of the ABC interpreter. The `instantiate`
+function uses `fetch()` to get the supporting WebAssembly and bytecode files.
+In this case the bytecode is in the Emscripten file system, so we pass our own
+`fetch` option that reads the bytecode from there whenever the requested path
+is `null`:
+
+```js
+function run (path) {
+  const opts = {
+    fetch: (p) => {
+      return p !== null ? fetch(p) : new Promise(resolve => {
+        const pbc = c_bcprelink.FS.readFile(path);
+        resolve({
+          ok: true,
+          arrayBuffer: () => pbc.buffer
+        });
+      });
+    },
+  };
+
+  return ABCInterpreter.instantiate(opts).then(abc => {
+    abc.interpreter.instance.exports.set_pc(abc.start);
+    var r = 0;
+    try {
+      r = abc.interpreter.instance.exports.interpret();
+    } catch (e) {
+      sandbox.stderr(e+'\n');
+      r = -1;
+    }
+
+    /* flush output buffer */
+    if (sandbox.stdout_buffer.length>0)
+      sandbox.stdout('\n');
+
+    if (r!=0)
+      sandbox.stderr('failed with return code '+r);
+  });
+}
+```
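+
+One detail of the `arrayBuffer` shim is worth keeping in mind: the `buffer`
+property of a typed array is the whole underlying `ArrayBuffer`, which may be
+larger than the array itself when the array is a view. Whether `FS.readFile`
+ever returns such a view is not something this post establishes, so the
+following is only a defensive sketch. The names `to_array_buffer` and
+`fake_response` are hypothetical.
+
+```javascript
+// Copy exactly the bytes that a Uint8Array covers into a fresh ArrayBuffer.
+function to_array_buffer (bytes) {
+  return bytes.buffer.slice(bytes.byteOffset, bytes.byteOffset + bytes.byteLength);
+}
+
+// Hypothetical helper: wrap bytes in the two members of a fetch() response
+// that the fetch option in run() provides.
+function fake_response (bytes) {
+  return {
+    ok: true,
+    arrayBuffer: () => to_array_buffer(bytes)
+  };
+}
+
+// A 4-byte view in the middle of a 16-byte buffer.
+const heap = new Uint8Array(16).map((_, i) => i);
+const view = heap.subarray(4, 8);
+
+const whole = view.buffer.byteLength;            /* size of the entire buffer */
+const exact = fake_response(view).arrayBuffer(); /* only the bytes of the view */
+```
+
+When the array owns its buffer, `to_array_buffer` simply returns a copy of the
+same bytes, so the helper is safe in both cases.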
+
+# Wrapping up
+
+That's it! There are many other small tricks here and there, but these three
+posts covered the architecture and most interesting implementation details. You
+can try out the [Clean Sandbox][] to see it live, or check out the [source
+code](https://gitlab.com/camilstaps/clean-sandbox).
+
+[Clean]: http://clean.cs.ru.nl/
+[Clean Sandbox]: https://camilstaps.gitlab.io/clean-sandbox/
+[WebAssembly]: https://webassembly.org/
+
+[introduction]: 2021-06-23-compiling-clean-in-the-browser-with-webassembly-part-1-introduction.html
+[pipeline]: 2021-06-23-compiling-clean-in-the-browser-with-webassembly-part-2-the-pipeline.html
+[integration]: 2021-06-23-compiling-clean-in-the-browser-with-webassembly-part-3-putting-it-all-together.html