Add support for loading out-of-tree backends #1058
Comments
I do not intend to add this functionality, but contributions to do this would be welcome. Backends can use all the ggml API in ggml-base, which excludes the backend registry and other backends.
As a first step, I posted a patch for setting the backend name with an environment variable. The next steps would be to add all backend APIs to public headers and to enforce versions when loading.
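To make the environment-variable idea concrete, here is a minimal sketch of how a loader could turn a backend name from the environment into a library file name. The variable name `EXAMPLE_GGML_BACKEND` and both helper functions are hypothetical and are not taken from the patch; the `libggml-<name>.so` naming is an assumption for a Linux-style build.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical variable name; the actual patch may use a different one.
constexpr const char *BACKEND_ENV_VAR = "EXAMPLE_GGML_BACKEND";

// Maps a backend name to a shared-library file name.
// The "libggml-<name>.so" pattern is an assumption of this sketch.
std::string backend_library_name(const std::string &name) {
    return "libggml-" + name + ".so";
}

// Returns the library file name to load, or an empty string when the
// variable is unset or empty (the caller would then fall back to the
// built-in backend list).
std::string backend_library_from_env() {
    const char *value = std::getenv(BACKEND_ENV_VAR);
    if (value == nullptr || *value == '\0') {
        return "";
    }
    return backend_library_name(value);
}
```

The resulting file name would then be passed to whatever dynamic-loading call the host uses; that step is left out because it depends on the platform.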
I am not sure what you mean by this. All the APIs that need to be public are already in public headers. Backends can use the extended internal API, but these functions should not be exposed in the public API. The API version is already enforced when loading a backend.
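For readers unfamiliar with load-time version enforcement, the general pattern can be sketched as below. This is an illustration only, not the actual ggml loader: `plugin_reg`, `HOST_API_VERSION` and `accept_plugin` are hypothetical names, with `HOST_API_VERSION` standing in for a constant such as `GGML_BACKEND_API_VERSION`.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical stand-in for the host's backend API version constant.
constexpr int HOST_API_VERSION = 2;

// Hypothetical registration record that a plugin hands to the loader.
struct plugin_reg {
    int         api_version; // version the plugin was compiled against
    const char *name;        // human-readable backend name
};

// Accepts the plugin only if it was built against the host's API version.
bool accept_plugin(const plugin_reg *reg) {
    if (reg == nullptr) {
        std::fprintf(stderr, "plugin returned no registration record\n");
        return false;
    }
    if (reg->api_version != HOST_API_VERSION) {
        std::fprintf(stderr, "plugin '%s': built for API version %d, host expects %d\n",
                     reg->name, reg->api_version, HOST_API_VERSION);
        return false;
    }
    return true;
}
```

An exact-match check like this is the strictest policy: any version bump forces out-of-tree backends to be rebuilt.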
What I mean by "public API" is the API that end-users are supposed to use, i.e. the header files in the
OK, then what I mean is the "backend API", which defines the interface that a ggml backend needs to implement. It seems that the "backend API" is different from the "public API", which is fine. What I propose is to have public headers for the backend API and version support at some point.
I am just not sure what this change would entail. To build a backend you need access to a ggml source tree, and then you need to include the files that you need from the
Not necessarily. For example, the RPC backend doesn't need this, and our use case is very similar to the RPC backend.
We are developing a proprietary ggml backend and we won't be able to open-source it any time soon (if at all). We still want to be able to load our backend at runtime and use it for all of the `llama` binaries in `llama.cpp`. Unfortunately, this is not possible at the moment because only a pre-defined set of in-tree backends is loaded. A quick workaround for this is the following patch, which we use as a stop-gap solution:

Are there any plans for adding support for dynamic loading of out-of-tree ggml backends?
I see the API is versioned with `GGML_BACKEND_API_VERSION`, but it seems there is no clear separation of the header definitions needed for 3rd party backend implementations. Right now I have to include `include/ggml.h`, `include/ggml-alloc.h`, `include/ggml-backend.h` and `src/ggml-backend-impl.h` to build our backend.

We have also been implementing a PJRT plugin, where they provide header definitions and API versioning. I think we can do the same for ggml.