Spin 1.5.0 fails to build on aarch64+musl with cross #1786

Closed
jprendes opened this issue Sep 20, 2023 · 4 comments · Fixed by #1802

Comments

@jprendes

I am trying to update Spin from 1.4.1 to 1.5.0 in deislabs/containerd-wasm-shims#151.
Spin 1.5.0 introduced llm support through the ggml crate, which itself depends on llama.cpp.
llama.cpp uses the vld1q_u8_x4 and vld1q_s8_x4 compiler intrinsics, which are missing from gcc < 10.3 for aarch64 (see xmrig/xmrig#2882). There was a similar fix, for Android only, in ggerganov/llama.cpp#2920.
Unfortunately, cross uses gcc 9 for aarch64+musl, resulting in a compilation failure.

It would be great if there was a feature flag to turn llm off.

@itowlson
Contributor

Ideas for options:

  1. Feature out LLM as you suggest. These builds of Spin would no longer provide the llm Wasm interface.
  • I believe this would cause JavaScript and Python components built with the 1.5 SDK to fail to load, with a link error. It's not a problem for Rust or Go, because dead code elimination removes the unused imports, but the JavaScript and Python SDKs have to import everything. You get a fragmentation of the Spin "world", where the component contract differs from place to place.
  • It might even cause problems for Spin itself, as it would no longer implement the fermyon:spin/reactor world. Would we have to create a reactor-sans-llm world? I'm not sure. cc @dicej @alexcrichton for wisdom on this.
  2. Provide a way for hosts to offer an alternate implementation of LLM. Hosts that don't care about LLM could provide an implementation that always returns an error (see the sketch after this list).
  • Fermyon Cloud already provides an alternate implementation, sending the work to GPUs. Admittedly FC interacts with the runtime at a much lower layer than most hosts, but I know there is interest in making the implementation more replaceable anyway, even in the "standard" host (e.g. in support of https://github.com/fermyon/spin-cloud-gpu).
  • The danger here is invisible fragmentation of the Spin "world": components that rely on an API move to a new deployment environment and suddenly it's not there. Under option 1 there is at least a chance that the component and host could detect the mismatch statically, during deployment validation. That said, we already have an element of this in that (say) inferencing on Spin has very different usability from inferencing on Fermyon Cloud - the two are definitely not interchangeable deployment environments!
  3. Try to upstream a fix to cross or llama.cpp.
  • Out of our control, and could be whack-a-mole if Spin adds other engines.
  4. Something else.
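
A rough sketch of what option 2 could look like, assuming a pluggable engine trait in the host. The LlmEngine name, method signature, and error type below are purely illustrative, not the actual spin-llm API:

```rust
// Hypothetical sketch: names and signatures are illustrative, not the real spin-llm API.

#[derive(Debug)]
pub struct LlmError(pub String);

pub struct InferencingResult {
    pub text: String,
}

/// The host decides which engine backs the llm interface.
pub trait LlmEngine: Send + Sync {
    fn infer(&self, model: &str, prompt: &str) -> Result<InferencingResult, LlmError>;
}

/// Engine for hosts that don't need LLM support: every call fails at runtime
/// instead of pulling ggml/llama.cpp into the build.
pub struct NoopLlmEngine;

impl LlmEngine for NoopLlmEngine {
    fn infer(&self, _model: &str, _prompt: &str) -> Result<InferencingResult, LlmError> {
        Err(LlmError("LLM support is not available in this host".into()))
    }
}
```

The trade-off is the invisible fragmentation described above: the component still links, but the capability silently differs between hosts.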

@alexcrichton
Contributor

I don't have the whole context here, but in isolation one other possible route would be to have a feature which disables llm but it's still provided as part of the world functions, meaning JS/Python would continue to work. If invoked at runtime, however, it would return errors rather than success along the lines of "support for llm was not enabled at compile time" or similar.
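
As a sketch of that idea, assuming a hypothetical `llm` Cargo feature on the host crate (the function names and the ggml_backend module below are illustrative only):

```rust
// Sketch of a compile-time gate: the llm world functions stay wired up, but
// without the (hypothetical) `llm` Cargo feature they return an error at
// runtime instead of calling into the ggml/llama.cpp-backed engine.

pub fn infer(model: &str, prompt: &str) -> Result<String, String> {
    #[cfg(feature = "llm")]
    {
        // Real path: delegate to the ggml-backed implementation.
        ggml_backend::infer(model, prompt)
    }
    #[cfg(not(feature = "llm"))]
    {
        let _ = (model, prompt);
        Err("support for llm was not enabled at compile time".to_string())
    }
}

#[cfg(feature = "llm")]
mod ggml_backend {
    pub fn infer(_model: &str, _prompt: &str) -> Result<String, String> {
        // Placeholder for the real ggml/llama.cpp call, not shown in this sketch.
        Err("not shown in this sketch".to_string())
    }
}
```

That way the fermyon:spin/reactor world stays intact for JS/Python components, while builds such as aarch64+musl can skip the ggml dependency entirely.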

@itowlson
Contributor

@jprendes As a temporary measure you could patch out the spin-llm crate using the Cargo patch feature. It's a bit brittle (for large values of "a bit", I admit) but could serve as a workaround until we figure out something more permanent. Example in the most recent commit on https://github.com/itowlson/lepton/commits/stub-out-llm

@jprendes
Author

Thanks. For the time being I am working around this by polyfilling the missing intrinsics in arm_neon.h in the cross container, but that's not a good long-term solution.

@michelleN michelleN moved this from 🆕 Triage Needed to 📋 Investigating / Open for Comment in Spin Triage Sep 22, 2023
@michelleN michelleN moved this from 📋 Investigating / Open for Comment to 🔖 Backlog in Spin Triage Sep 25, 2023
@github-project-automation github-project-automation bot moved this from 🔖 Backlog to ✅ Done in Spin Triage Sep 27, 2023