Benefits of non-nullable references? #40
Two benefits I can see.
About 1, which engines do you mean? It seems like every time you create a struct, you would zero-initialize the fields anyway. 2 is an interesting point I hadn't considered! Yes, if we allow catching traps in the future it seems like it could help.
This usually only happens for a per-page allocation during
@skuzmich Thanks, yes, I think you're right. So as with linear memory, null checks can be "free" with the signal handler trick, and so non-nullable references wouldn't speed up GC references either. @MaxGraey You'd only need to guard the region around 0 once, and then it helps you with every single check later. (You're right that it can't be done on 32-bit platforms, or in places where the signal handler trick can't be used for other reasons - but in wasm we've basically already agreed that the signal handler trick is something we can depend on in the most important cases, which is why wasm didn't add a different form of sandboxing like masking.)
Non-nullable types could be more efficient for ref.cast from the GC proposal, and for similar instructions that access a struct's RTT/header without trapping on a null reference. In addition, on point 2: even without being able to catch traps in Wasm, Wasm state can be observed after a trap in hosts like JS. Thus, unless they invert nullability, Wasm compilers would not be able to reorder operations on nullable references.
I was wrong about ref.cast above. Adding trapping versions of those instructions would solve the problem.
I'm not sure I subscribe to that. Even with bounds checking you cannot rely on signal handling everywhere. 32-bit platforms are still real, and so are techniques like pointer compression, or execution environments with no access to signals. The knowledge that something isn't null also allows certain optimisations: for example, in instructions following an explicit null check, the engine can elide the cases where null is treated specially but without trapping; likewise when inlining a function that performs an explicit null check into a call site where the parameter is known to have non-null type. In simple cases, the engine could use flow analysis to derive similar information, but that does not work across function boundaries, except when inlining.
I agree with all of those points. I think you might be responding to a point I didn't make, though, so apologies if I wasn't clear enough. My point is that we knew 32-bit platforms etc. exist, and we considered adding a form of linear memory sandboxing like masking that does work on them and that is cheaper than explicit bounds checks. But we decided that the signal handler trick was possible in enough places, and the overhead of explicit bounds checks small enough, that it wasn't worth it. It seems like the same thinking could apply to GC, unless we expect 32-bit platforms to be more important there, or the overhead larger, for some reason?
I agree with this too. It's related to @skuzmich's point that
I think this is definitely worth measuring to see how important it is. As before, something similar happens with linear memory checks, where potentially trapping things like memory accesses and integer divides etc. cannot be reordered. I'm not aware of data showing that that is a significant issue, but it may be, and it may be different in GC. Note btw that the binaryen optimizer has an
To add another dimension of concern, I'm worried about how many non-null reference types will actually be generated. Many languages do not reason about nullness. Many languages that do still use null for other reasons. One simple example is using
@kripken we have more freedom optimizing non-nullable
V8 has no plans to use signal handler tricks for null checks on any platform. (Reason: our
I haven't measured what the performance impact of these checks is. To me, the interesting (and hard-to-answer) question is how much guaranteed non-null-ness optimizing compilers will be able to infer by themselves (it's easy to come up with scenarios where it's trivial, as well as others where an engine can't possibly prove it, and in between there's the possible-but-difficult area), and how long it will take for engines to actually implement such optimizations.
Could that sentinel object be in its own OS page, and that page protected in the same way that linear memory is? |
@RossTate I wouldn't worry about Kotlin. I think most of your concerns regarding initialization nullability are based on the JVM implementation, which implies separate compilation and a bytecode format without sound type nullability. Many language features (like when-expressions, primary constructor properties, etc.) encourage immediate variable initialization: ~95% of Kotlin/JS standard library local variables are immediately initialized. Since separate compilation is not very viable in Wasm for many reasons, we can generate non-nullable Wasm types for most immediately-initialized non-nullable fields in a single whole-program compilation. Even if I'm wrong and we aren't able to generate a lot of non-nullable types on average, I would argue it is still important to give programmers a way to micro-optimize their perf-critical code by choosing the "efficient" initialization scheme. I will share concrete numbers once non-nullable type generation is done.
Is it because of interop with nulls from other subsystems, like JS?
A good solution would probably look like an inter-procedural analysis (and maybe higher-level language type information), which would be more viable in AOT optimizers than in Wasm JITs. There are some optimizations that are not representable in Wasm types, like multiple accesses to the same object where one access dominates the others. Generating a local as_non_null cast after the first access can make things worse for a non-optimizing wasm engine with the signal handler trick, but it might be better for current V8 in some cases.
I considered that idea before saying "I don't see that happening" :-) Also, V8 implementation aside, IIUC memory protection tricks can only handle
"It's the
Agreed, our current prototype would achieve slightly better performance in some cases if you did that, but I would generally recommend that we first build things (in both compilers and engines) as simple as we can, then profile real-world use cases, and then decide where we need more complexity in order to improve things. Chances are that the situations where an X-to-Wasm compiler can easily emit such local optimizations are the same cases where a Wasm engine can fairly easily make the inference; so if/when we decide that this is worth improving, we should coordinate on the "how". |
Thanks for the useful information, @jakobkummerow! It's very interesting because it permits a bunch of flexibility. In particular, you can get from and set to fields (up to some offset) of a null reference, so long as you check that the reference is null before you take any observable actions based on what you got. Plus branch hinting/prediction is pretty effective. I imagine y'all have thoroughly explored the tradeoffs for JS, and given that JS is not a null-safe language or particularly easy to reason about, I would expect that it would have been very well positioned to benefit from trapping versus testing. So maybe that's a sign that testing is in fact not particularly expensive. @skuzmich To clarify my perspective, I think there is potential value for non-null references in WebAssembly and that we should leave the door open for such an extension, but at the moment we have bigger concerns to address. For example, I am very concerned about your statement that WebAssembly is not currently viable for separate compilation. It will be easier to make the GC MVP support separate compilation if the MVP can—for now—assume all value types are defaultable. Non-null reference types are currently the only value type in any proposal that would break that simplifying assumption, and their advantages don't seem to merit the complexity—for now. So, for me, this issue is a matter of prioritization. |
I think it is very likely that other Wasm engines will implement null checks with trap handling because they are so common. Wizard isn't at the stage yet where that optimization matters (no JIT yet), but it already effectively does this, using an implementation-language
@RossTate thank you, I agree that simplicity is better than a small performance benefit for now. But some people here have concerns that performance benefits might actually be significant. Because at the end of the day we are trying to compete with JS as a compilation target, and performance is the biggest competitive advantage we can have. We need more measurements :) Also, this probably deserves a separate discussion, but separate compilation is a very hard issue to solve:
Thanks for all the comments so far, everyone! Overall it seems like there may be a performance benefit here. We will just need to measure it to be sure. And it doesn't sound like there are any non-performance benefits. If we don't get clear numbers on this being a useful speedup, then as @RossTate said, it may make sense to defer non-nullability for later (as long as we don't do anything to prevent it from happening later). I think that may have value since non-nullability does add some complexity here (which surprised me), in particular the new
Hi all, I created a loop which repeatedly applies an operation requiring a null check on a local value. I measured the running time of the loop when the value was nullable vs. non-nullable. To isolate the effect of the operation from the loop jump and counter increment, I also ran an empty loop (which just increments a counter). Note that, since the operation was always run on the same non-null object, branch prediction should have hit every time. The results might be different otherwise. Unfortunately, the benchmark is quite flaky, but these are the trends I observed: These numbers are of course specific to the situation of a tight loop with a specific payload, and may not be representative of an actual program. However, I think they do show that non-nullability is an effective feature.
This is not to detract from the discussions here and above. @skuzmich made a clever suggestion in #40 (comment). I reread the blog on pointer compression, and I think it should be compatible with that. I suspect the GC already sets up some space for permanent objects created during initialization, so my guess is that you could implement this by putting the
It might be possible to create such a
Also, in general, I wouldn't want the specification to rely on engines implementing one particular optimization technique. It's good to know that certain patterns can be optimized through advanced hackery, but it's also good to assume/allow implementation flexibility: what's viable in one engine today might not be viable in another engine, or in a future with different constraints and/or fundamental design choices, or on other platforms (e.g. 32-bit). |
Sounds good! I think that the current spec, because it allows
So, altogether, my sense is that the spec should not be changed assuming a faulting implementation of
The idea is fine (and not new, I think; ISTR Lisp implementations have used this for nil in the past) but it is not performance-neutral. Accesses will be faster, but tests will be slower. As the bit pattern for the null value is now something specific rather than all-zeroes, the null value has to be loaded (from TLS, say) for anything that compares with null. Frequently the TLS pointer is in a register, so this is often just a single load, and it can be commoned if the value is used frequently, but then it occupies a register. In contrast, compare-with-zero-and-branch is fast and compact on most systems. Additionally, as Jakob says, the JS null will be a different value; this means significant complexity at the JS/wasm boundary. This worries me more, actually. (Finally, the wasm null must be the same everywhere, which means managing a shared resource somehow, but this seems like small potatoes.)
Consensus is to include non-null types, so closing this. |
I understand the benefit of non-nullable references in a source language, but I was curious what we expect the benefits to be in wasm? Is it for speed, or correctness, or something else?
The one minor benefit I can think of is that a wasm=>wasm optimizer could optimize out some ref.is_null checks based on it. The overview mentions the overall benefits of the proposal, but none of those seem to apply to non-nullable references. I'm probably missing something obvious, though.