This is a collection of links and ideas and notes on compiling OCaml to WASM.
From http://caml.inria.fr/pub/ml-archives/caml-list/2009/03/3a77bfcca0f90b763d127d1581d6a2f1.en.html (Xavier Leroy, 2009):
3- A language implementation like OCaml breaks down in four big parts: 1- Front-end compiler 2- Back-end compiler and code emitter 3- Run-time system 4- OS interface [...] 6- Here is a schematic of the Caml compiler. (Use a fixed-width font.) | | parsing and preprocessing v Parsetree (untyped AST) | | type inference and checking v Typedtree (type-annotated AST) | | pattern-matching compilation, elimination of modules, classes v Lambda / \ / \ closure conversion, inlining, uncurrying, v \ data representation strategy Bytecode \ \ Cmm | | code generation v Assembly code
A more recent high-level view of the compilation pipeline (from https://ocamllabs.slack.com/archives/C0JCHGE78/p1568626615023800, Sep 16, 2019):
Source code | | parsing v Parsetree | | typing v Typedtree | | desugar pattern matching, modules, objects, etc; erase types, | make explicit memory layout in terms of blocks and values | v Lambda (higher order lambda calculus based IR) | | make closure construction and usage explicit | perform inlining | v Clambda (like Lambda but with explicit closures, direct/indirect calls) | | make block/value manipulation explicit | make allocation explicit | v Cmm (tree-structured, explicit memory manipulation, C calls, etc) | | perform instruction selection, | sequentialization into basic blocks, | assignment of pseudo-registers | v Mach (block structured IR) | | liveness, register allocation, dead code elimination | are Mach -> Mach transformations | v Linear (linear sequence of abstract assembly instructions, explicit register assignments) | | this step is heavily backend-dependent, implemented in `emit.mlp` | v Textual assembly code
Both OCaml bytecode and OCaml native code comes with a runtime that provides functions needed to run the compiled program.
- The runtime provides
-
-
a copying garbage collector, the
caml_alloc
-functions -
the
caml_apply
-functions that implement curried function application -
binary and unary operations on tagged integers/floats
-
exception handling
-
TODO: find out what else the runtime does.
- Dealing with OCaml values' lifetimes in WASM
-
-
gc1) "Manual" garbage collection on the WASM linear memory, by maintaining a stack in WASM linear memory and porting the garbage collector to WASM
This is apparently what the WASM backend of the Go language does.
-
gc2) Use the WASM garbage collector extension to allocate and manage OCaml values
The WASM garbage collector specification is still at a very early stage, and it is not clear how exactly it will work. The only reasonable way, at this point in time, to attempt this is to prototype our own WASM GC extension.
What if, while browser support is not there yet, we could compile to WASM+GC and then compile WASM+GC to WASM?
-
gc3) Allocate heap objects on "the JavaScript side of the world" via Reference Types
This is more or less a work-around for not having a WASM GC extension.
-
gc4) Create a version of OCaml that has static lifetimes, similar to Rust.
I will not do this, since pretty much all existing OCaml code would need to be rewritten in order to be compiled to WASM with this variant of the language. Also, this may be different enough to OCaml that this is essentially a new programming language.
-
- Direct
-
-
1a) translate Lambda → WASM
-
1b) translate Cmm → WASM
-
1c) translate OCaml bytecode → WASM
-
1d) run a bytecode interpreter for OCaml in WASM
-
- Indirect
-
-
2a) Cmm → LLVM → WASM
-
2b) OCaml bytecode → LLVM → WASM
-
2c) Ocaml → machine code → WASM
-
While there are currently no projects that translate OCaml’s lambda IR to WASM, there are these:
-
[production-ready] Bucklescript (Ocaml rawlambda) → JavaScript: https://github.com/BuckleScript/bucklescript
This may or may not be helpful, I do not know.
"Js_of_ocaml focuses more on existing OCaml ecosystem(opam) while BuckleScript’s major goal is to target npm"
"Js_of_ocaml and BuckleScript have slightly different runtime encoding in several places, for example, BuckleScript encodes OCaml Array as JS Array while js_of_ocaml requires its index 0 to be of value 0."
Overview of the bucklescript compiler: https://github.com/BuckleScript/bucklescript/blob/00ad78cbcfd1132d3a5931fe760706de35e480f6/site/docsource/Compiler-overview.adoc
git compare of the bucklescript fork of the ocaml compiler to the official ocaml compiler: https://github.com/ocaml/ocaml/compare/4.02…BuckleScript:4.02.3+BS
-
the Grain Language → WASM https://github.com/grain-lang/grain
Even though the source language used here is not OCaml, there might be some interesting observations in here about compiling a functional language to WASM.
"Low-level IR, suitable for direct translation into WASM": https://github.com/grain-lang/grain/blob/78dc08b2887226cf0b9f93357ca6fd689fcd1405/src/codegen/mashtree.ml
Starting from an already optimized version of the program is likely to result in a comparatively fast execution speed.
Generally, it appears that Cmm is a good starting point when compiling to WASM without using the WASM GC extension, since the memory representation has already been flattened at the Cmm stage.
-
[abandoned] https://github.com/rolph-recto/ocaml-wasm/tree/wasm/wasmcomp
"first working version: compiles arith exprs only", latest commit Jul 27, 2018
-
[WIP] Ocaml Cmm → WASM https://github.com/SanderSpies/ocaml/tree/manual_gc/asmcomp/wasm32
Experiments on GC: https://github.com/SanderSpies/ocaml-wasm-gc-experimenting
I seems that this is based on the official WASM specification
ast.ml
, but copied and modified to use a symbol type, instead of string for function and variable identifiers: https://github.com/SanderSpies/ocaml/commit/60a0d4218b34a0ace29a39e925c12cb5a76a3c55It also looks like there is a few (commmented-out) lines added for the upcoming WASM exception-handling feature.
-
[WIP] Haskell Cmm → WASM https://github.com/tweag/asterius
"we implement the cmm-to-wasm code generator as yet another native backend, and any non-Haskell logic of the runtime is hand-written WebAssembly code, which means we’re simulating various rts interfaces to the degree that a significant portion of vanilla Haskell code becomes runnable." (see here)
Garbage collection: tweag/asterius#52
I am not aware of any projects that attempt translating from OCaml bytecode to WASM. Please let me know if you are.
An advantage is that the bytecode interpreter hardly ever changes at all (it is said to still be quite similar to what is laid out in the original report on ZINC).
There is no dependency on compiler internals, as we can work on the bytecode output of ocamlc
.
In the past, translating bytecode has proven to be a successful and maintainable strategy for compiling OCaml to different languages:
-
[production-ready] OCaml bytecode → JavaScript: https://github.com/ocsigen/js_of_ocaml
https://www.irif.fr/~balat/publications/vouillon_balat-js_of_ocaml.pdf presents performance results from 2011: The code generated by
js_of_ocaml
running on the V8 JavaScript engine was faster than running the bytecode interpreter on the bytecode generated byocamlc
, and slower than the machine code generated byocamlopt
. Exceptions turned out to be very expensive.js_of_ocaml
is being used in production systems, as far as I know, it is currently the best tool to compile OCaml to JavaScript.OCaml values are allocated on the JavaScript heap (gc3), thus, the calls to the garbage collector are just stubs: https://github.com/ocsigen/js_of_ocaml/blob/e7a34b8e0697a34b235ff121132c72121c16798d/runtime/gc.js
Note: It is unlikely, that exceptions will be an issue when compiling to WASM, since the exception mechanism in WASM will be different from the one in JavaScript.
-
[inactive] Ocaml bytecode → C: https://github.com/bvaugon/ocamlcc
According to http://michel.mauny.net/data/papers/mauny-vaugon-ocamlcc-oud2012.pdf, performance in 2012 was better than running the bytecode interpreter, and worse than running the machine code generated by
ocamlopt
, which essentially was to be expected. However, this comes at the cost of having large executables, roughly up to twice the size of machine code in the considered examples.I managed to compile this using an older version of the OCaml compiler.
I can compile trivial test programs to C.
Compiling that C code using Emscripten to WASM, I am stuck with this error on the JavaScript console:
exception thrown: RuntimeError: index out of bounds,_caml_page_table_modify@http://127.0.0.1:8000/output.js:45026:1 _caml_page_table_add@http://127.0.0.1:8000/output.js:44203:1 _caml_set_minor_heap_size@http://127.0.0.1:8000/output.js:89253:1 _caml_init_gc@http://127.0.0.1:8000/output.js:90849:1 _caml_main@http://127.0.0.1:8000/output.js:99291:1 _main@http://127.0.0.1:8000/output.js:110038:1 Module._main@http://127.0.0.1:8000/output.js:6717:10 callMain@http://127.0.0.1:8000/output.js:7005:15 doRun@http://127.0.0.1:8000/output.js:7064:23 run/<@http://127.0.0.1:8000/output.js:7075:7
I’m having trouble debugging this because I don’t have source maps for the C files where the
_caml_
-functions come from. The reason seems to be that the files aren’t actually included, only the headers. So I need to figure out what parameters to provide to emcc. In order to do that, I need to figure out what parameters ocamlcc uses to compile the code with gcc.I was able to get the parameters from ocamlcc by using the -verbose option, now the error is this:
shared:ERROR: emcc: cannot find library "curses"
While I could continue here, I think that this is a dead end due to the large code size.
-
[inactive] https://github.com/sebmarkbage/ocamlrun-wasm
sebmarkbage compiled the OCaml bytecode interpreter, as well as the GC to WASM using emscripten. Latest commit Mar 6, 2017
I tried to compile this, but am stuck at the problem described in Issue 1
-
https://github.com/vincentdchan/ocaml
same principle as sebmarkbage’s approach, but adds support of libraries such as Unix library, Ctypes, Base, Core_kernel
If there was a compiler from OCaml to LLVM, it would immediately enable compilation to WASM.
-
[abandoned] Cmm → LLVM https://github.com/whitequark/ocaml-llvm-ng/blob/master/lib/llvmcomp.ml
-
[abandoned] OCaml bytecode → LLVM https://github.com/raph-amiard/CamllVM
"TLDR : In the end it is just not worth it to optimize this project for performance. A better approach would be to start from scratch and do a real OCaml → LLVM compiler for ocamlopt, that would be able to use the full AST with type information." https://news.ycombinator.com/item?id=4798320
For compiling machine code to WASM, there apparently do not currently exist any solutions.
One would need to apply some kind of algorithm that transforms the control flow from a program-counter-based representation to the labeled continuations that can be seen in WASM, just like Emscripten’s "Relooper" algorithm does for LLVM.
If there is an architecture whose machine code can be translated to WASM in a reasonably efficient fashion, and it turns out that OCaml already compiles to this architecture, this could be interesting.
If successful, this could, in the long run, help getting many other languages to compile to WASM as well.
-
wasm - https://opam.ocaml.org/packages/wasm/
"An OCaml library to read and write Web Assembly (wasm) files and manipulate their AST."
-
"Malfunctional Programming" (the author implemented an interpreter for lambda, which should be similar to one for Cmm) https://www.cl.cam.ac.uk/~sd601/papers/malfunction.pdf
-
"Caml Virtual Machine - Instruction set Document version: 1.4" http://cadmium.x9c.fr/distrib/caml-instructions.pdf
description of compiler version 3.11.2’s bytecode
-
How to write programs that never allocate http://www.ocamlpro.com/2016/04/01/asm-ocaml/
-
loading
Cmm
in the interpreter:#load "compiler-libs/ocamlcommon.cma";; #load "compiler-libs/ocamloptcomp.cma";; #require "compiler-libs.optcomp";; #show_module Cmm;;
-
A Scheme to WASM compiler - https://github.com/google/schism
Schism allocating values on the JS side: WebAssembly/tool-conventions#122 (comment)
-
WASM tail calls proposal - https://github.com/WebAssembly/tail-call/blob/125201ced9a0f158553d08d8a20b7152f3057367/proposals/tail-call/Overview.md
-
CakeML a verified ML compiler - https://cakeml.org/
Contains formal semantics for different intermediate representations in their compilation pipeline.
"A New Verified Compiler Backend for CakeML" (presentation of the overall structure and the intermediate languages of the CakeML compiler) https://cakeml.org/icfp16.pdf
CakeML to WebAssembly talk https://lorenz.leutgeb.xyz/paper/cakeml-wasm-viennajs-beam.pdf
-
How to Represent Elm functions in Web Assembly https://dev.to/briancarroll/elm-functions-in-webassembly-50ak
-
Solving the structured control flow problem (Improving on the Relooper algorithm) https://medium.com/leaningtech/solving-the-structured-control-flow-problem-once-and-for-all-5123117b1ee2
-
there is a parser for
.cmm
files intestsuite/tools/parsecmm.mly
of the OCaml compiler -
it is possible to generate a
.merlin
file for the compiler by, starting from a clean repository, running./configure
anddune build @world
. Even if the full build fails, the.merlin
file will likely be there and useful. -
How to explore Cmm semantics by looking at the generated x86 code (useful if you feel comfortable with assembly code):
ocamlopt test2.ml -S -inline 0 -nodynlink ocamlopt test2.ml -dcmm
One can then look at the Cmm and the generated
test2.s
side-by-side.When compiling on an x86-64 Intel machine using Debian, one can look at the runtime functions in https://github.com/ocaml/ocaml/blob/5ad64306d36755b600f2556805effb73627508c8/runtime/amd64.S.
-
Talk on Rust, WebAssembly and JavaScript by Ashley Williams https://www.infoq.com/presentations/rust-webassembly-javascript/?utm_source=youtube&utm_medium=link&utm_campaign=qcontalks
-
IR for compiling WASM to Cmm https://github.com/SimonJF/cmm_of_wasm/blob/10fea570f80f91ee26c23a6cca48b29795c9967b/src/lib/ir/annotated.ml
-
Mozilla Garbage collector Info https://github.com/lars-t-hansen/moz-gc-experiments/
Some tests related to
anyref
and garbage collector: https://hg.mozilla.org/mozilla-central/file/tip/js/src/jit-test/tests/wasm/gc/ -
Web assembly meetings repository (last video call from July 23rd) https://github.com/WebAssembly/meetings
-
GC without a shadow stack in an "uncooperative environment" WebAssembly/gc#36 (comment)
As noted in a later comment, this technique becomes even more complex when the stack consists of both JavaScript and WASM frames. It doesn’t look like a good idea to do that due to the increase in code size.
This later comment WebAssembly/gc#36 (comment) proposes "making the execution stack visible" or implementing "an efficient user stack with a defined layout, including what gets put there when a function is called. Have a second instruction set that works with this stack: us.i32.const 50, us.i32.mul, etc".
If we had any possibility to inspect the WASM stack to discover the gc roots, this would be nice. However, in order for the OCaml GC to move blocks, we also need the ability to modify. A generic read/write feature for the WASM stack is not desirable due to security implications.
-
Paper: Accurate Garbage Collection in an Uncooperative Environment. Fergus Henderson. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.87.3769&rep=rep1&type=pdf
-
Paper: Accurate Garbage Collection in Uncooperative Environments Revisited. J. Baker, A. Cunei, T. Kalibera, F. Pizlo, J. Vitek. http://www.filpizlo.com/papers/baker-ccpe09-accurate.pdf
-
sharing JS data structures with WASM in the absence of reference types WebAssembly/interface-types#18 (comment)
Inefficient but interesting workaround to use browser GC before a WASM GC appears: a) keep an Object that maps references (keys of the JavaScript Object) to arrays (representing the OCaml heap blocks), b) allocate by adding a new key to the Object, c) deallocate by deleting a key from the Object, d) provide functions to write/read the heap blocks from WASM. Clearly, the runtime performance here depends on: x) how fast function calls from WASM to JS are, y) how fast Object and array access is in JS. Note: if we had a way to read the WASM stack, we know the roots, and we can "deallocate" by removing unreachable keys from the Object. The actual garbage collection is done by the JavaScript GC. In WASM we only keep immutable handles to the data (the keys to the JS Object).
Performance will be worse than using a shadow stack with the OCaml gc compiled to WASM on the WASM linear memory. In order to be able to refer to any heap block by a reference, data will be "flattened" in the sense that every key maps to an array consisting of only numbers (primitives) and strings (references into the JS Object).
In this comment, the same approach is mentioned: https://www.reddit.com/r/ProgrammingLanguages/comments/bbhz69/wasm_reftypes_are_enabled_by_default_in_firefox/ekj4mhr/
With reftypes, the technique of using JavaScript values to represent OCaml heap blocks is simplified by not having to take the additional indirection through a JS Object - the references to JS data can be passed to WASM directly (which, I think, implies that the Browser GC is responsible for modifying moved references and for discovering its own GC roots). To model pointer values in OCaml heap blocks correctly, one would need to embed the data pointed to in the JavaScript array. So, we would have Arrays of numbers and arrays in JavaScript to model OCaml heap blocks. Performance now largely depends on the cost of array access and on the cost of function calls between WASM and JS.
Note that we don’t ever need to inspect the WASM stack, we can store heap references in WASM local variables, and we don’t need to keep a shadow stack, so the cost tradeoff would be between a) maintaining a shadow stack with OCaml’s GC and b) modeling OCaml heap blocks via reftypes and JS arrays accessed by function calls with the JavaScript GC. Code size is likely smaller with the JS approach.
-
Note: We could avoid using local variables for OCaml values in WASM and instead use global variables as "registers". This means that compilation to WASM would not be very different from compilation to machine code (the main difference would be how control flow is compiled / register allocation works exactly as in compilation to native). However, currently this does not work since it is not possible to import/export global variables. See here: https://github.com/WebAssembly/mutable-global/blob/89e5be9d69f2afac7243b6d2ff36b9c8723efb77/proposals/mutable-global/Overview.md
-
WASM future features list https://github.com/WebAssembly/design/blob/master/FutureFeatures.md
-
LLVM WASM backend https://github.com/llvm/llvm-project/tree/6088f84398847152ad97eb1bc0b139a28e879b48/llvm/lib/Target/WebAssembly
"After register stackification and register coloring, convert non-stackified registers into locals, inserting explicit local.get and local.set instructions." https://github.com/llvm/llvm-project/blob/6088f84398847152ad97eb1bc0b139a28e879b48/llvm/lib/Target/WebAssembly/WebAssemblyExplicitLocals.cpp
-
ocaml/runtime/globroots.c
I currently believe that the purpose is that you can use the OCamlalloc
from other languages (e.g. C, Java) explicitly (so that you can deallocate eventually by removing from the global roots). This technique could be ported to WASM.ocaml/runtime/roots_nat.c
scans the stack for gc roots, which is something that cannot be ported as-is to WASM. -
An explanation of dynamic linking with LLD for WASM: https://iandouglasscott.com/2019/07/18/experimenting-with-webassembly-dynamic-linking-with-clang/
-
wasm-ld https://lld.llvm.org/WebAssembly.html
WASM tool conventions on linking https://github.com/WebAssembly/tool-conventions/blob/master/Linking.md
-
Notes on the background of WASI https://github.com/CraneStation/wasmtime/blob/3ae7c60b1395fe25971f683828241e7fac8cb40b/docs/WASI-background.md
-
Planning of reference types implementation in the Lucet WASM compiler bytecodealliance/lucet#272
Using Lucet as a playground to prototype WASM implementations doesn’t seem a good idea.
-
Multi-core ocaml compiler Wiki https://github.com/ocaml-multicore/ocaml-multicore/wiki
Garbage collector invariants, possibly helpful https://github.com/ocaml-multicore/ocaml-multicore/wiki/Garbage-collector-invariants
-
Boxing vs. tagging in the OCaml compiler
Boxing is the act of allocating a heap block to store a primitive value. This is done by using the appropriate heap block header for the type of the primitive value. Unboxing is the reverse operation, taking a heap-allocated primitive and returning just the primitive value.
Tagging is done with integers everywhere, in order to be able to store them unboxed on the heap without confusing the garbage collector (who scans the heap for pointers and will confuse integers ending with 0 with pointers)
-
Wouter van Oortmerssen — Bring your language to Wasm! https://github.com/sabine/ocaml-to-wasm-overview.git
Hands-on intro to how to generate WASM code. Also brief discussion of language features and tools that can be used.
-
"Run OCaml in the browser by WebAssembly " https://okcdz.medium.com/run-ocaml-in-the-browser-by-webassembly-31ce464594c6
Running OCaml in the browser by compiling the OCaml bytecode interpreter to WASM. Adding support of libraries such as Unix library, Ctypes, Base, Core_kernel.
-
binaryen_dsl package on opam https://opam.ocaml.org/packages/binaryen_dsl/
wrapper around Binaryen that allows to generate WAT and WASM files from a formalization of the Binaryen DSL. Not sure how complete it is, as it is a very new package.