
ggml : skip register metal backend on os simulator #10132

Open: wants to merge 1 commit into master from fix-ios-disable-gpu
Conversation

@jhen0409 (Collaborator) commented Nov 2, 2024

Fix for #10089.

jhen0409 added a commit to a-ghorbani/llama.rn that referenced this pull request Nov 2, 2024
jhen0409 added a commit to mybigday/llama.rn that referenced this pull request Nov 2, 2024
* feat: sync llama.cpp

* fix: fix submodule update - as part of llama.cpp sync

* chore: remove unnecessary comment

* chore(example): revert unnecessary changes

* feat: sync llama.cpp

* fix: remove tfs_z

ref: ggerganov/llama.cpp#10071

* fix(cpp): skip gpu device if n_gpu_layers <= 0

ref: ggerganov/llama.cpp#10132

---------

Co-authored-by: Jhen-Jie Hong <[email protected]>
@ggerganov (Owner) left a comment


This is not a good solution - we should avoid special-casing backends like this from now on. In theory, even with ngl == 0, a backend can still be utilized to offload very heavy compute ops (for example, like we do with large batches with the CUDA backend).

@jhen0409 (Collaborator, Author) commented Nov 2, 2024

> This is not a good solution - we should avoid special-casing backends like this from now on. In theory, even with ngl == 0, a backend can still be utilized to offload very heavy compute ops (for example, like we do with large batches with the CUDA backend).

Got it. Maybe just skipping the Metal backend registration in the simulator will be enough to fix this issue.

For disabling the device, it looks like the TODO will be a better approach.

@jhen0409 force-pushed the fix-ios-disable-gpu branch from 7ef6580 to cd457dc on November 2, 2024 10:32
@github-actions bot added the ggml label (changes relating to the ggml tensor library for machine learning) on Nov 2, 2024
@jhen0409 jhen0409 changed the title llama : skip metal device if n_gpu_layers <= 0 ggml : skip register metal backend on os simulator Nov 2, 2024