Diffstat (limited to 'resources/md/2021-06-23-compiling-clean-in-the-browser-with-webassembly-part-3-putting-it-all-together.md')
-rw-r--r-- | resources/md/2021-06-23-compiling-clean-in-the-browser-with-webassembly-part-3-putting-it-all-together.md | 193 |
1 files changed, 193 insertions, 0 deletions
diff --git a/resources/md/2021-06-23-compiling-clean-in-the-browser-with-webassembly-part-3-putting-it-all-together.md b/resources/md/2021-06-23-compiling-clean-in-the-browser-with-webassembly-part-3-putting-it-all-together.md
new file mode 100644
index 0000000..0244507
--- /dev/null
+++ b/resources/md/2021-06-23-compiling-clean-in-the-browser-with-webassembly-part-3-putting-it-all-together.md
@@ -0,0 +1,193 @@
+*This is part 3 in a series on running the [Clean][] compiler in
+[WebAssembly][], with the proof of concept in the [Clean Sandbox][]. In this
+part I discuss how all the individual components are integrated. See the
+[introduction][] for a high-level overview. In the previous part, I discussed
+[the compilation pipeline][pipeline].*
+
+[[toc]]
+
+# Overview
+
+We have to put all our different elements together in JavaScript. In particular
+we have the following elements:
+
+- The C tools (compiler backend, bytecode generator, bytecode linker, and
+  bytecode prelinker)
+- The WebAssembly interpreters in which the Clean tools run (compiler frontend,
+  `make`-like program)
+- The WebAssembly interpreter for the compiled program itself.
+
+# Linking the compiler frontend and backend
+
+We have already seen how [the `make` tool communicates with the C tools and
+with the compiler][pipeline]. It is a little trickier to set up the
+communication between the compiler frontend and the backend. This is because
+the `make` tool was written for this application specifically, while the
+compiler wasn't.
+
+In Clean, the foreign function interface works through a special ABC
+instruction, `ccall`. For example, if we have a C function `int c_add(int,
+int)`, we can define `add` below. `II:I` here stands for the type: two integer
+arguments and an integer return type.
+
+```clean
+add :: !Int !Int -> Int
+add _ _ = code {
+	ccall c_add "II:I"
+}
+```
+
+This instruction is not supported by the interpreter, because there is no way
+to implement it generically in that context. When interpretation reaches a
+`ccall` (or any other unimplemented instruction), the WebAssembly interpreter
+calls a JavaScript function that can try to handle it. This function can read
+and modify the state of the ABC machine, and return to it after it has handled
+the instruction. This allows us to link the compiler frontend to the backend in
+JavaScript:
+
+```js
+var clean_compiler;
+function create_clean_compiler () {
+	return ABCInterpreter.instantiate({
+		interpreter_imports: {
+			handle_illegal_instr: (pc, instr, asp, bsp, csp, hp, hp_free) => {
+				instr = ABCInterpreter.instructions[instr];
+				if (instr == 'ccall')
+					return i_ccall.bind(c_compiler)(clean_compiler, pc, asp, bsp);
+				else
+					return 0;
+			},
+			/* other options .. */
+		}
+	}).then(abc => {
+		clean_compiler = abc;
+	});
+}
+```
+
+With `i_ccall` we can link a WebAssembly interpreter to an Emscripten module,
+which contains the global C functions:
+
+```js
+function i_ccall (abc, pc, asp, bsp) {
+	const fun = '_' + abc.get_clean_string(abc.memory_array[pc/4+2]-8); /* e.g. c_add; emscripten adds an underscore */
+	var type = abc.get_clean_string (abc.memory_array[pc/4+4]-8); /* e.g. II:I */
+
+	if (!(fun in this))
+		throw 'ccall: unknown function '+fun;
+
+	/* parse type; get arguments from the Clean heap and stack .. */
+
+	const result = this[fun].apply(null, args);
+
+	/* copy the result to the WebAssembly interpreter .. */
+
+	return pc+24; /* return the address of the next instruction */
+}
+```
+
+# Clean file I/O
+
+We have a similar problem with Clean programs doing file I/O. The compiler
+frontend uses file I/O, but for this it needs ABC instructions that are not
+implemented in the WebAssembly interpreter. We extend `handle_illegal_instr` to
+also catch these instructions, and use Emscripten's `FS` library to implement
+them. For example, `writeFS` writes a string to a file. `handle_illegal_instr`
+becomes:
+
+```js
+handle_illegal_instr: (pc, instr, asp, bsp, csp, hp, hp_free) => {
+	instr = ABCInterpreter.instructions[instr];
+	if (instr == 'ccall')
+		return i_ccall.bind(c_compiler)(clean_compiler, pc, asp, bsp);
+	else if (instr == 'writeFS')
+		return i_writeFS.bind(c_compiler)(clean_compiler, pc, asp, bsp);
+	else
+		return 0;
+},
+```
+
+And again `i_writeFS` is defined to link an arbitrary WebAssembly interpreter
+and Emscripten module together:
+
+```js
+
+function i_writeFS (abc, pc, asp, bsp) {
+	const i = abc.memory_array[bsp/4+2];
+
+	/* get arguments from the stack */
+	const s_ptr = abc.memory_array[asp/4];
+	const size = abc.memory_array[s_ptr/4+2];
+	const s = new Uint8Array(abc.memory_array.buffer, s_ptr+16, size);
+
+	this.FS.write(abc.files[i].stream, s, 0, size);
+
+	abc.interpreter.instance.exports.set_asp(asp-8); /* pop the string from the stack */
+	return pc+8; /* return the address of the next instruction */
+}
+```
+
+In reality we need about 10 instructions for file I/O, but the others are
+similar.
+
+# Running the generated bytecode
+
+We can now run a Clean program from our editor through the [compilation
+pipeline][pipeline] and generate a prelinked bytecode file. This is similar to
+a native executable, but for the WebAssembly interpreter. What remains is
+actually running the bytecode.
+
+We do this by simply creating a new instance of the ABC interpreter. The
+`instantiate` function uses `fetch()` to get supporting WebAssembly and
+bytecode files. In this case, our bytecode comes from the local file system, so
+we monkey-patch `fetch()` to supply the bytecode from the Emscripten file
+system (the case that the path is `null`):
+
+```js
+function run (path) {
+	const opts = {
+		fetch: (p) => {
+			return p !== null ? fetch(p) : new Promise(resolve => {
+				const pbc = c_bcprelink.FS.readFile(path);
+				resolve({
+					ok: true,
+					arrayBuffer: () => pbc.buffer
+				});
+			});
+		},
+	};
+
+	return ABCInterpreter.instantiate(opts).then(abc => {
+		abc.interpreter.instance.exports.set_pc(abc.start);
+		var r = 0;
+		try {
+			r = abc.interpreter.instance.exports.interpret();
+		} catch (e) {
+			sandbox.stderr(e+'\n');
+			r = -1;
+		}
+
+		/* flush output buffer */
+		if (sandbox.stdout_buffer.length>0)
+			sandbox.stdout('\n');
+
+		if (r!=0)
+			sandbox.stderr('failed with return code '+r);
+	});
+}
+```
+
+# Wrapping up
+
+That's it! There are many other small tricks here and there, but these four
+posts covered the architecture and most interesting implementation details. You
+can try out the [Clean Sandbox][] to see it live, or check out the [source
+code](https://gitlab.com/camilstaps/clean-sandbox).
+
+[Clean]: http://clean.cs.ru.nl/
+[Clean Sandbox]: https://camilstaps.gitlab.io/clean-sandbox/
+[WebAssembly]: https://webassembly.org/
+
+[introduction]: 2021-06-23-compiling-clean-in-the-browser-with-webassembly-part-1-introduction.html
+[pipeline]: 2021-06-23-compiling-clean-in-the-browser-with-webassembly-part-2-the-pipeline.html
+[integration]: 2021-06-23-compiling-clean-in-the-browser-with-webassembly-part-3-putting-it-all-together.html
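
A note on the "parse type" step that `i_ccall` elides in the post: it could look roughly like the sketch below. `parse_ccall_type` is a hypothetical helper, not code from the Clean Sandbox. It assumes only the simple `arguments:result` form and the `I` (integer) letter from the `c_add` example, and rejects anything else rather than guessing.

```javascript
/* Hypothetical helper for illustration; not taken from the Clean Sandbox. */
function parse_ccall_type (type) {
	/* Assumption: only 'I' (integer) is known here, as in the "II:I" example. */
	const kinds = {I: 'int'};

	/* Assumption: exactly one ':' separating arguments from results. */
	const parts = type.split(':');
	if (parts.length != 2)
		throw 'ccall: unsupported type '+type;

	/* Map every letter to a kind, failing loudly on unknown letters. */
	const parse = (letters) => Array.from(letters).map(c => {
		if (!(c in kinds))
			throw 'ccall: unsupported type letter '+c;
		return kinds[c];
	});

	return {args: parse(parts[0]), results: parse(parts[1])};
}

const parsed = parse_ccall_type('II:I');
console.log(parsed.args.length+' argument(s), '+parsed.results.length+' result(s)');
```

A handler in the style of `i_ccall` could use `parsed.args` to decide how many values to read from the stack before calling the C function. Throwing on unknown letters matches how `i_ccall` already treats unknown function names.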