This repository has been archived by the owner on Jan 26, 2022. It is now read-only.

SIMD.js on non-SSE devices #317

Closed
nmostafa opened this issue Feb 4, 2016 · 8 comments

Comments

@nmostafa
Contributor

nmostafa commented Feb 4, 2016

We had agreed to keep SIMD functionality always enabled, regardless of whether the implementation is optimized or not. This also seems to be the opinion of TC39.

This poses a problem on non-SSE devices. In Chakra, the unoptimized implementation (runtime library) uses SSE2 intrinsics. This was a design choice to guarantee identical semantics to JIT'ed code (no change in precision, rounding, etc. when transitioning from interpreter to JIT and vice versa). If SIMD is enabled without SSE2, the runtime library will not function, and hence the problem.

One solution is to have a non-SSE sequential implementation of each operation as fall-back code if SSE2 is not available, which will be OK since JITing of SIMD ops will be disabled on such platforms. This is obviously a large amount of work to develop, maintain and test as part of the runtime. Another solution is to use the JS polyfill on those platforms, but I am not sure if that would be spec-compliant.
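
As a rough illustration of what a sequential fall-back for a single operation might look like, here is a minimal sketch in JavaScript rather than C++. The names `float32x4` and `addFloat32x4` are hypothetical and are not taken from Chakra or the SIMD.js polyfill. The sketch assumes that rounding each lane with `Math.fround` is enough to match a hardware single-precision add.

```javascript
// Hypothetical scalar stand-in for a 4-lane float32 vector (not a real engine API).
function float32x4(x, y, z, w) {
  // Float32Array storage rounds each lane to single precision on write.
  return Float32Array.of(x, y, z, w);
}

// Lane-wise add done sequentially; Math.fround rounds each sum to float32.
function addFloat32x4(a, b) {
  const out = new Float32Array(4);
  for (let lane = 0; lane < 4; lane++) {
    out[lane] = Math.fround(a[lane] + b[lane]);
  }
  return out;
}

const sum = addFloat32x4(float32x4(1, 2, 3, 4), float32x4(0.5, 0.25, 0.125, 0.1));
```

A real runtime would need one such routine per operation and per vector type, which is where the development and testing cost comes from.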

Or should we reconsider making SIMD.js optional (based on platform features, optimized implementation)? Thoughts?

@johnmccutchan
Collaborator

> This poses a problem on non-SSE devices. In Chakra, the unoptimized implementation (runtime library) uses SSE2 intrinsics. This was a design choice to guarantee identical semantics to JIT'ed code (no change in precision, rounding, etc. when transitioning from interpreter to JIT and vice versa). If SIMD is enabled without SSE2, the runtime library will not function, and hence the problem.

SSE2 has been in x86 chips for ~15 years starting with the Pentium 4. Windows 8 and above require a CPU that has support for NX Bit. That was included in the Pentium 4. So, how can someone run Chakra on a chip without SSE2?

> Or should we reconsider making SIMD.js optional (based on platform features, optimized implementation)? Thoughts?

No, we should not reconsider making SIMD.js optional.

@littledan
Member

Two points:

  • The SIMD.js spec is already well-defined without executing on actual SIMD hardware, and V8 has successfully been going about its implementation by starting with a C++ implementation rather than SSE for its baseline cross-platform implementation
  • TC39 can't stop companies from shipping non-spec-compliant JavaScript implementations without full functionality

@nmostafa
Contributor Author

nmostafa commented Feb 5, 2016

@johnmccutchan

> SSE2 has been in x86 chips for ~15 years starting with the Pentium 4. Windows 8 and above require a CPU that has support for NX Bit. That was included in the Pentium 4. So, how can someone run Chakra on a chip without SSE2?

True, but I am talking about low-end IoT devices (e.g. Intel Quark). SIMD.js for such devices would require a C++ or x87-based code-gen implementation. And it seems pointless to me to allow the feature where there is no SIMD ISA to start with, and no performance benefit.

@littledan, good points.

> The SIMD.js spec is already well-defined without executing on actual SIMD hardware, and V8 has successfully been going about its implementation by starting with a C++ implementation rather than SSE for its baseline cross-platform implementation

Do you know if this implementation is used alongside the optimized one? Do you bail out from optimized SSE code to the C++ implementation?

> TC39 can't stop companies from shipping non-spec-compliant JavaScript implementations without full functionality.

Good point. But it seems that being spec-compliant is a goal of IoT JS engines (see JerryScript), and as you mentioned, we would need at least a C++ implementation for these platforms.

@jfbastien

I don't understand the proposal: what would JS code using SIMD.js do if an implementation were to not implement the SIMD.js option? Simply not work?

How is that different from current SIMD.js (which isn't optional), where an engine decides to diverge by not implementing SIMD.js? From a user's perspective it still doesn't work.

Users lose in both cases! I don't get the advantage of saying that something is optional. What am I missing?

@nmostafa
Contributor Author

nmostafa commented Feb 5, 2016

> I don't understand the proposal: what would JS code using SIMD.js do if an implementation were to not implement the SIMD.js option? Simply not work?

I imagine the JS code would have a fall-back sequential version of the vectorized kernel. JS code might look like this:

// typeof avoids a ReferenceError when the SIMD global does not exist at all
if (typeof SIMD !== 'undefined') {
  /* vectorized version */
} else {
  /* sequential version */
}

One would expect the vectorized path to always be faster. However, if SIMD is always defined, regardless of performance, we may end up with the vectorized path being slower. That's because a generic baseline implementation requires a call to the runtime for each SIMD op, while the sequential code can be type-specialized. So by making SIMD optional, we reflect implementation status.
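
To make the pattern concrete, here is a minimal sketch of one kernel with both paths. `addArrays` is a hypothetical name, and the vectorized branch assumes the `SIMD.Float32x4.load`, `store` and `add` functions as described in the SIMD.js proposal. On an engine without a `SIMD` global, only the sequential loop runs.

```javascript
// Adds two equal-length Float32Arrays element by element (hypothetical kernel).
function addArrays(a, b, out) {
  let i = 0;
  // typeof avoids a ReferenceError when the SIMD global is not defined at all.
  if (typeof SIMD !== 'undefined') {
    const f4 = SIMD.Float32x4;
    // Vectorized path: four lanes per iteration.
    for (; i + 4 <= a.length; i += 4) {
      f4.store(out, i, f4.add(f4.load(a, i), f4.load(b, i)));
    }
  }
  // Sequential path; also handles the tail left over by the vectorized loop.
  for (; i < a.length; i++) {
    out[i] = a[i] + b[i];
  }
  return out;
}

const result = addArrays(
  Float32Array.of(1, 2, 3, 4, 5),
  Float32Array.of(10, 20, 30, 40, 50),
  new Float32Array(5)
);
```

Note that this check only tells the script whether `SIMD` exists, not whether it is fast, which is the concern raised above.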

Another advantage is not investing in a generic C++ implementation that users don't really care about.

@johnmccutchan
Collaborator

@nmostafa TC39 has already decided that SIMD.js is not going to be made optional. This won't be relitigated. I suggest that the Chakra team follow the V8 team in developing a generic C++ (or JavaScript) implementation as the fallback.

@littledan
Member

For more general background, TC39 has repeatedly rejected the idea of making a more embedded, IoT-friendly profile, though it has been proposed several times. I gave Apple space to bring up this topic again at the January meeting, and it was again rejected.

@nmostafa
Contributor Author

Thanks for the clarification, @littledan.
