Add initial ProducersSection.md #65

Merged
merged 17 commits into from
Nov 16, 2018
109 changes: 109 additions & 0 deletions ProducersSection.md
@@ -0,0 +1,109 @@
# Producers Section

The purpose of the producers section is to provide an optional,
highly-structured record of all the distinct tools that were used to produce
a given WebAssembly module. A primary purpose of this record is to allow
broad analysis of toolchain usage in the wild, which can help inform both wasm
producers and consumers.

The producers section is a
[custom section](https://webassembly.github.io/spec/core/binary/modules.html#custom-section)
and thus has no semantic effects and can be stripped at any time.
Since the producers section is relatively small, tools are encouraged to emit
the section or include themselves in an existing section by default, keeping
the producers section even in release builds.

WebAssembly consumers should avoid using the producers section to derive
optimization hints. To ensure portable performance, hints should be
standardized in a separate custom section, probably in the core spec's
[Custom Sections appendix](https://webassembly.github.io/spec/core/appendix/custom.html).

An additional goal of the producers section is to provide a discrete, but
easily-growable [list of known tools/languages](#known-list) for each
record field. This avoids the skew that otherwise happens with unstructured
strings. Unknown names do not invalidate an otherwise-valid producers section.
However, wasm consumers may provide less accurate telemetry results for unknown
names or even emit diagnostics encouraging the name to be put on the known list.

Since version information is useful but highly variable, each field value
is accompanied by a version string so that the name can remain stable
over time without requiring frequent updates to the known list.

## Custom Section

Custom section `name` field: `producers`

The producers section may appear only once, and only after the
[Name section](https://webassembly.github.io/spec/core/appendix/custom.html#name-section).
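
As a rough illustration of the placement rule above, the following Python sketch walks the sections of a module and checks where `producers` appears. The function names (`custom_section_names`, `check_producers_placement`) are hypothetical, and the check assumes that "only after the Name section" constrains ordering only when a name section is actually present; it performs no other validation.

```python
from typing import List, Tuple


def read_varuint32(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode an unsigned LEB128 integer; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7


def custom_section_names(module: bytes) -> List[str]:
    """List the names of all custom sections (id 0) in file order."""
    if module[:4] != b"\x00asm":
        raise ValueError("not a wasm binary")
    pos = 8  # skip the 4-byte magic and the 4-byte version
    names = []
    while pos < len(module):
        section_id = module[pos]
        size, pos = read_varuint32(module, pos + 1)
        if section_id == 0:
            # A custom section's contents start with its name.
            name_len, name_pos = read_varuint32(module, pos)
            names.append(module[name_pos:name_pos + name_len].decode("utf-8"))
        pos += size  # jump to the next section header
    return names


def check_producers_placement(names: List[str]) -> None:
    """Raise ValueError if the placement rule (as assumed here) is violated."""
    if names.count("producers") > 1:
        raise ValueError("producers section appears more than once")
    if "producers" in names and "name" in names:
        if names.index("producers") < names.index("name"):
            raise ValueError("producers section precedes the name section")
```

A consumer could call `check_producers_placement(custom_section_names(module_bytes))` before attempting to decode the section.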

The producers section contains a sequence of fields with unique names, where the
end of the last field must coincide with the last byte of the producers section:

| Field | Type | Description |
| ----------- | ----------- | ----------- |
| field_count | `varuint32` | number of fields that follow |
| fields | `field*` | sequence of field_count `field` records |

where a `field` is encoded as:

| Field | Type | Description |
| ----------------- | ---- | ----------- |
| field_name | [name][name-ref] | name of this field |
| field_value_count | `varuint32` | number of `versioned-name` entries that follow |
| field_values | `versioned-name*` | sequence of field_value_count name-version pairs |

where a `versioned-name` is encoded as:

| Field | Type | Description |
| ------- | ---- | ----------- |
| name | [name][name-ref] | name of the language/tool |
| version | [name][name-ref] | version of the language/tool |

with the additional constraint that each field_name in the list must be unique
and appear in the first column of the following table, and the name strings
within a given field's field_values must be unique and appear in the second
column of that field_name's row.

| field_name | field_value name strings |
| -------------- | -------------------- |
| `language` | [source language list](#source-languages) |
| `processed-by` | [individual tool list](#individual-tools) |
| `sdk` | [SDK list](#sdks) |

[name-ref]: https://webassembly.github.io/spec/core/binary/values.html#names
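
To make the layout above concrete, here is a minimal Python sketch that encodes and decodes the contents of a producers section (the bytes that follow the custom section's `producers` name). It assumes `varuint32` is unsigned LEB128 and that a name is a length-prefixed UTF-8 byte string, as in the core spec. The function names and the version strings in the usage note are made up for illustration, and the known-list check is left out.

```python
from typing import Dict, List, Tuple

Fields = Dict[str, List[Tuple[str, str]]]


def encode_varuint32(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_name(text: str) -> bytes:
    """Encode a name as a byte length followed by UTF-8 bytes."""
    data = text.encode("utf-8")
    return encode_varuint32(len(data)) + data


def encode_producers(fields: Fields) -> bytes:
    """Encode field_count followed by each field record."""
    out = bytearray(encode_varuint32(len(fields)))
    for field_name, values in fields.items():
        out += encode_name(field_name)
        out += encode_varuint32(len(values))
        for name, version in values:
            out += encode_name(name)
            out += encode_name(version)
    return bytes(out)


def decode_producers(payload: bytes) -> Fields:
    """Decode a payload, enforcing uniqueness and the end-of-section rule."""
    pos = 0

    def varuint32() -> int:
        nonlocal pos
        result, shift = 0, 0
        while True:
            byte = payload[pos]
            pos += 1
            result |= (byte & 0x7F) << shift
            if not byte & 0x80:
                return result
            shift += 7

    def name() -> str:
        nonlocal pos
        length = varuint32()
        end = pos + length
        if end > len(payload):
            raise ValueError("name runs past the end of the section")
        text = payload[pos:end].decode("utf-8")
        pos = end
        return text

    fields: Fields = {}
    for _ in range(varuint32()):
        field_name = name()
        if field_name in fields:
            raise ValueError("duplicate field_name: " + field_name)
        values: List[Tuple[str, str]] = []
        seen = set()
        for _ in range(varuint32()):
            tool, version = name(), name()
            if tool in seen:
                raise ValueError("duplicate name in " + field_name + ": " + tool)
            seen.add(tool)
            values.append((tool, version))
        fields[field_name] = values
    if pos != len(payload):
        raise ValueError("last field does not end at the last byte of the section")
    return fields
```

For example, `encode_producers({"language": [("C", "11")], "processed-by": [("LLVM", "7.0.0")]})` yields a payload that `decode_producers` turns back into the same dictionary.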

## Known list

The following lists contain all the known names for the fields listed above.
**If your tool is not on this list and you'd like it to be, please submit a PR.**

### Source Languages

It is possible for multiple source languages to be present in a single module
when the outputs of multiple compiled languages are statically linked together.

* `wat`
* `C`
* `C++`

### Individual Tools

It is possible (and common) for multiple tools to be used in the overall
pipeline that produces and optimizes a given wasm module.

* `wabt`
* `LLVM`
* `lld`
* `Binaryen`

### SDKs

While an SDK is technically just another tool, the `sdk` field designates the
top-level "thing" that the developer installs and interacts with directly to
produce the wasm module.

* `Emscripten`
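
A consumer that wants to flag names missing from these lists, without rejecting the section, might do something like the Python sketch below. The `KNOWN_NAMES` table is only a snapshot of the lists in this document, and the function name and message wording are hypothetical.

```python
from typing import Dict, List, Tuple

# Snapshot of the known lists above; a real consumer would keep this in sync.
KNOWN_NAMES = {
    "language": {"wat", "C", "C++"},
    "processed-by": {"wabt", "LLVM", "lld", "Binaryen"},
    "sdk": {"Emscripten"},
}


def unknown_name_diagnostics(fields: Dict[str, List[Tuple[str, str]]]) -> List[str]:
    """Return one message per name that is not on the known list.

    `fields` maps each field_name to its (name, version) pairs. Versions are
    deliberately ignored, since only names appear on the known list.
    """
    diagnostics = []
    for field_name, values in fields.items():
        known = KNOWN_NAMES.get(field_name)
        if known is None:
            diagnostics.append("unknown field_name: " + field_name)
            continue
        for name, _version in values:
            if name not in known:
                diagnostics.append(
                    field_name + ": '" + name + "' is not on the known list; "
                    "consider submitting a PR to add it"
                )
    return diagnostics
```

Returning messages instead of raising keeps an otherwise-valid section usable, which matches the intent that unknown names only reduce telemetry accuracy.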

## Text format

TODO
Contributor


Is there any advantage to using a dedicated syntax for that? I think that once custom sections are available in wast, it will be easy to declare the producers.

Member Author


Well even when custom sections are in wast, you'd still have to write out the encoded binary, which seems unpleasant to read or write. For example, if you look at a wasm module in the browser debugger, it'd be nice if you simply saw the toolchain.