1 files changed, 193 insertions, 0 deletions
diff --git a/resources/md/2021-06-23-compiling-clean-in-the-browser-with-webassembly-part-3-putting-it-all-together.md b/resources/md/2021-06-23-compiling-clean-in-the-browser-with-webassembly-part-3-putting-it-all-together.md
new file mode 100644
index 0000000..0244507
--- /dev/null
+++ b/resources/md/2021-06-23-compiling-clean-in-the-browser-with-webassembly-part-3-putting-it-all-together.md
@@ -0,0 +1,193 @@
+*This is part 3 in a series on running the [Clean][] compiler in
+[WebAssembly][], with the proof of concept in the [Clean Sandbox][]. In this
+part I discuss how all the individual components are integrated. See the
+[introduction][] for a high-level overview. In the previous part, I discussed
+[the compilation pipeline][pipeline].*
+
+[[toc]]
+
+# Overview
+
+We have to put all our different elements together in JavaScript. In particular,
+we have the following elements:
+
+- The C tools (compiler backend, bytecode generator, bytecode linker, and
+  bytecode prelinker)
+- The WebAssembly interpreters in which the Clean tools run (compiler frontend,
+  `make`-like program)
+- The WebAssembly interpreter for the compiled program itself
+
+# Linking the compiler frontend and backend
+
+We have already seen how [the `make` tool communicates with the C tools and
+with the compiler][pipeline]. It is a little trickier to set up the
+communication between the compiler frontend and the backend. This is because
+the `make` tool was written for this application specifically, while the
+compiler wasn't.
+
+In Clean, the foreign function interface works through a special ABC
+instruction, `ccall`. For example, if we have a C function `int c_add(int,
+int)`, we can define `add` below. `II:I` here stands for the type: two integer
+arguments and an integer return type.
+
+```clean
+add :: !Int !Int -> Int
+add _ _ = code {
+	ccall c_add "II:I"
+}
+```
+
+This instruction is not supported by the interpreter, because there is no way
+to implement it generically in that context. When interpretation reaches a
+`ccall` (or any other unimplemented instruction), the WebAssembly interpreter
+calls a JavaScript function that can try to handle it. This function can read
+and modify the state of the ABC machine, and return to it after it has handled
+the instruction. This allows us to link the compiler frontend to the backend in
+JavaScript:
+
+```js
+var clean_compiler;
+function create_clean_compiler () {
+	return ABCInterpreter.instantiate({
+		interpreter_imports: {
+			handle_illegal_instr: (pc, instr, asp, bsp, csp, hp, hp_free) => {
+				instr = ABCInterpreter.instructions[instr];
+				if (instr == 'ccall')
+					return i_ccall.bind(c_compiler)(clean_compiler, pc, asp, bsp);
+				else
+					return 0;
+			},
+			/* other options .. */
+		}
+	}).then(abc => {
+		clean_compiler = abc;
+	});
+}
+```
+
+With `i_ccall` we can link a WebAssembly interpreter to an Emscripten module,
+which contains the global C functions:
+
+```js
+function i_ccall (abc, pc, asp, bsp) {
+	const fun = '_' + abc.get_clean_string(abc.memory_array[pc/4+2]-8); /* e.g. c_add; emscripten adds an underscore */
+	var type = abc.get_clean_string (abc.memory_array[pc/4+4]-8); /* e.g. II:I */
+
+	if (!(fun in this))
+		throw 'ccall: unknown function '+fun;
+
+	/* parse type; get arguments from the Clean heap and stack .. */
+
+	const result = this[fun].apply(null, args);
+
+	/* copy the result to the WebAssembly interpreter .. */
+
+	return pc+24; /* return the address of the next instruction */
+}
+```
+
+# Clean file I/O
+
+We have a similar problem with Clean programs doing file I/O. The compiler
+frontend uses file I/O, but for this it needs ABC instructions that are not
+implemented in the WebAssembly interpreter. We extend `handle_illegal_instr` to
+also catch these instructions, and use Emscripten's `FS` library to implement
+them. For example, `writeFS` writes a string to a file. `handle_illegal_instr`
+becomes:
+
+```js
+handle_illegal_instr: (pc, instr, asp, bsp, csp, hp, hp_free) => {
+	instr = ABCInterpreter.instructions[instr];
+	if (instr == 'ccall')
+		return i_ccall.bind(c_compiler)(clean_compiler, pc, asp, bsp);
+	else if (instr == 'writeFS')
+		return i_writeFS.bind(c_compiler)(clean_compiler, pc, asp, bsp);
+	else
+		return 0;
+},
+```
+
+And again `i_writeFS` is defined to link an arbitrary WebAssembly interpreter
+and Emscripten module together:
+
+```js
+
+function i_writeFS (abc, pc, asp, bsp) {
+	const i = abc.memory_array[bsp/4+2];
+
+	/* get arguments from the stack */
+	const s_ptr = abc.memory_array[asp/4];
+	const size = abc.memory_array[s_ptr/4+2];
+	const s = new Uint8Array(abc.memory_array.buffer, s_ptr+16, size);
+
+	this.FS.write(abc.files[i].stream, s, 0, size);
+
+	abc.interpreter.instance.exports.set_asp(asp-8); /* pop the string from the stack */
+	return pc+8; /* return the address of the next instruction */
+}
+```
+
+In reality we need about 10 instructions for file I/O, but the others are
+similar.
+
+# Running the generated bytecode
+
+We can now run a Clean program from our editor through the [compilation
+pipeline][pipeline] and generate a prelinked bytecode file. This is similar to
+a native executable, but for the WebAssembly interpreter. What remains is
+actually running the bytecode.
+
+We do this by simply creating a new instance of the ABC interpreter. The
+`instantiate` function uses `fetch()` to get supporting WebAssembly and
+bytecode files. In this case, our bytecode lives in the Emscripten file system,
+so we monkey-patch `fetch()` to supply the bytecode from there (this is the case
+where the requested path is `null`):
+
+```js
+function run (path) {
+	const opts = {
+		fetch: (p) => {
+			return p !== null ? fetch(p) : new Promise(resolve => {
+				const pbc = c_bcprelink.FS.readFile(path);
+				resolve({
+					ok: true,
+					arrayBuffer: () => pbc.buffer
+				});
+			});
+		},
+	};
+
+	return ABCInterpreter.instantiate(opts).then(abc => {
+		abc.interpreter.instance.exports.set_pc(abc.start);
+		var r = 0;
+		try {
+			r = abc.interpreter.instance.exports.interpret();
+		} catch (e) {
+			sandbox.stderr(e+'\n');
+			r = -1;
+		}
+
+		/* flush output buffer */
+		if (sandbox.stdout_buffer.length>0)
+			sandbox.stdout('\n');
+
+		if (r!=0)
+			sandbox.stderr('failed with return code '+r);
+	});
+}
+```
+
+# Wrapping up
+
+That's it! There are many other small tricks here and there, but these posts
+have covered the architecture and the most interesting implementation details. You
+can try out the [Clean Sandbox][] to see it live, or check out the [source
+code](https://gitlab.com/camilstaps/clean-sandbox).
+
+[Clean]: http://clean.cs.ru.nl/
+[Clean Sandbox]: https://camilstaps.gitlab.io/clean-sandbox/
+[WebAssembly]: https://webassembly.org/
+
+[introduction]: 2021-06-23-compiling-clean-in-the-browser-with-webassembly-part-1-introduction.html
+[pipeline]: 2021-06-23-compiling-clean-in-the-browser-with-webassembly-part-2-the-pipeline.html
+[integration]: 2021-06-23-compiling-clean-in-the-browser-with-webassembly-part-3-putting-it-all-together.html