Add support for loading out-of-tree backends #1058

Open
rgerganov opened this issue Jan 2, 2025 · 8 comments

@rgerganov
Collaborator

We are developing a proprietary ggml backend and we won't be able to open-source it any time soon (if at all). We still want to be able to load our backend at runtime and use it with all of the llama.cpp binaries. Unfortunately, this is not possible at the moment because only a predefined set of in-tree backends is loaded. As a quick workaround, we use the following patch as a stop-gap solution:

@@ -537,6 +537,11 @@ void ggml_backend_load_all_from_path(const char * dir_path) {
     bool silent = false;
 #endif
 
+    // check the environment variable GGML_BACKEND to load an out-of-tree backend
+    const char * env_backend = std::getenv("GGML_BACKEND");
+    if (env_backend) {
+        ggml_backend_load_best(env_backend, silent, dir_path);
+    }
     ggml_backend_load_best("blas", silent, dir_path);
     ggml_backend_load_best("cann", silent, dir_path);
     ggml_backend_load_best("cuda", silent, dir_path);

Are there any plans for adding support for dynamic loading of out-of-tree ggml backends?
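
For illustration, the load order that the patch produces can be sketched in isolation. This is a minimal, hypothetical sketch: `backend_load_order` is not a ggml function, and the in-tree list is truncated to the three names visible in the diff context.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of ggml): returns backend names in the order
// the patched loader would try them. `env_backend` stands for the value of
// the GGML_BACKEND environment variable, or nullptr when it is unset.
std::vector<std::string> backend_load_order(const char * env_backend) {
    std::vector<std::string> names;
    if (env_backend != nullptr) {
        // The out-of-tree backend is tried before any in-tree backend.
        names.push_back(env_backend);
    }
    // Only the in-tree names shown in the diff context; the real list is longer.
    for (const char * name : {"blas", "cann", "cuda"}) {
        names.push_back(name);
    }
    return names;
}
```

Calling `backend_load_order(std::getenv("GGML_BACKEND"))` with the variable set to a made-up name such as `mybackend` puts that name first, followed by the in-tree names; with the variable unset, only the in-tree names are returned.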

I see the API is versioned with GGML_BACKEND_API_VERSION, but there seems to be no clear separation of the header definitions needed for third-party backend implementations. Right now I have to include include/ggml.h, include/ggml-alloc.h, include/ggml-backend.h and src/ggml-backend-impl.h to build our backend.

We have also been implementing a PJRT plugin, where PJRT provides header definitions and API versioning. I think we can do the same for ggml.

@slaren
Collaborator

slaren commented Jan 2, 2025

I do not intend to add this functionality, but contributions to do this would be welcome.

Backends can use all of the ggml API in ggml-base, which excludes the backend registry and other backends. GGML_BACKEND_API_VERSION should be bumped whenever there is an incompatible change to the ggml-base ABI, but I cannot guarantee that this will be done reliably until we start versioning ggml.
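
To make "incompatible change to the ABI" concrete, here is a small sketch of how inserting a field into a shared struct moves the fields after it. The two structs are hypothetical and only loosely tensor-shaped; they are not ggml definitions.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical struct as compiled into an out-of-tree backend (older tree).
struct demo_tensor_old {
    int32_t type;
    int64_t ne[4];
    void *  data;
};

// The same struct in a newer host after a field was inserted (hypothetical).
struct demo_tensor_new {
    int32_t type;
    int64_t ne[4];
    int64_t nb[4];  // newly inserted field shifts everything after it
    void *  data;
};

// True only if code built against the old definition would still find
// `data` at the same offset and see the same overall size.
constexpr bool demo_layout_compatible() {
    return sizeof(demo_tensor_old) == sizeof(demo_tensor_new) &&
           offsetof(demo_tensor_old, data) == offsetof(demo_tensor_new, data);
}

static_assert(!demo_layout_compatible(), "inserting a field changes the ABI");
```

A backend compiled against the old layout would read `data` from the wrong offset in a host using the new layout, which is the kind of change a version bump is meant to catch.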

@rgerganov
Collaborator Author

As a first step, I posted a patch for setting the backend name with an environment variable. The next steps would be to add all backend APIs to public headers and enforce versions when loading.

@slaren
Collaborator

slaren commented Jan 3, 2025

> The next steps would be to add all backend APIs to public headers and enforce versions when loading.

I am not sure what you mean by this. All the APIs that need to be public are already in public headers. Backends can use the extended internal API, but these functions should not be exposed in the public API. The API version is already enforced when loading a backend.

@rgerganov
Collaborator Author

ggml_backend_init() and ggml_backend_score() are defined in ggml-backend-impl.h which, if I understand correctly, is an internal header and not part of the public API. If I need to create a shared library which exports these two functions, I have to include ggml-backend-impl.h. Am I missing something?
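
As a rough picture of what such a shared library exports, here is a self-contained sketch. Everything prefixed with `demo_` or `DEMO_` is a hypothetical stand-in: the real types and entry-point signatures live in ggml's headers and may differ, and the version value is a placeholder.

```cpp
// Hypothetical stand-in for the registration object a backend hands to the loader.
struct demo_backend_reg {
    int          api_version;
    const char * name;
};

// Placeholder value; a real backend would use the constant from the ggml
// headers it was built against.
constexpr int DEMO_BACKEND_API_VERSION = 1;

// Export with C linkage so the loader can look the symbols up by name.
#if defined(_WIN32)
#  define DEMO_EXPORT extern "C" __declspec(dllexport)
#else
#  define DEMO_EXPORT extern "C" __attribute__((visibility("default")))
#endif

// Entry point assumed to return the backend's registration object.
DEMO_EXPORT demo_backend_reg * demo_backend_init(void) {
    static demo_backend_reg reg = { DEMO_BACKEND_API_VERSION, "my-backend" };
    return &reg;
}

// Entry point assumed to rank this library against other candidates;
// a higher value is assumed to mean a better match for the current machine.
DEMO_EXPORT int demo_backend_score(void) {
    return 1;
}
```

The point of the sketch is that the exported symbols depend on types the backend does not define itself, which is why the header that declares them matters to out-of-tree builds.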

@slaren
Collaborator

slaren commented Jan 3, 2025

What I mean by "public API" is the API that end-users are supposed to use, i.e. the header files in the include directory. Backends can also use the internal API by including the headers in the src directory. In fact, the build script automatically adds the src directory to the include directories of backends. If you are using an external build script, it is up to you to replicate this behavior.

@rgerganov
Collaborator Author

OK, then what I mean is the "backend API", which defines the interface that a ggml backend needs to implement. It seems that the "backend API" is different from the "public API", which is fine.

What I propose is to have public headers for the backend API and version support at some point.

rgerganov added a commit to rgerganov/ggml that referenced this issue Jan 3, 2025
@slaren
Collaborator

slaren commented Jan 3, 2025

> What I propose is to have public headers for the backend API and version support at some point.

I am just not sure what this change would entail. To build a backend you need access to a ggml source tree, and then you need to include the files that you need from the src directory. I don't think that needs to be changed. There is already version checking in the api_version field of ggml_backend_reg. However, it is not enough to bump this when the interface structures in ggml-backend-impl.h change; it is also necessary to bump it when the ggml structures and functions that the interface depends on change, since those are part of the backend ABI as well.
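
The kind of load-time check described here can be sketched as follows. This is only an illustration under stated assumptions: `host_view_reg`, `HOST_BACKEND_API_VERSION` and `host_accept_backend` are hypothetical names, not the actual ggml loader code, and the exact-match comparison is an assumption.

```cpp
#include <cstdio>

// Placeholder for the API version the host was built with.
constexpr int HOST_BACKEND_API_VERSION = 1;

// Hypothetical, reduced view of a registration object returned by a backend.
struct host_view_reg {
    int          api_version;
    const char * name;
};

// Returns the registration if the host can use it, nullptr otherwise.
// Assumes an exact match is required, with no compatibility ranges.
const host_view_reg * host_accept_backend(const host_view_reg * reg) {
    if (reg == nullptr) {
        return nullptr;
    }
    if (reg->api_version != HOST_BACKEND_API_VERSION) {
        std::fprintf(stderr, "backend %s: API version %d, host expects %d\n",
                     reg->name, reg->api_version, HOST_BACKEND_API_VERSION);
        return nullptr;
    }
    return reg;
}
```

A single integer can only protect the host if it is bumped on every incompatible change, including changes to structures the interface merely depends on, which is the caveat raised above.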

@rgerganov
Collaborator Author

> I am just not sure what this change would entail. To build a backend you need access to a ggml source tree, and then you need to include the files that you need from the src directory.

Not necessarily. For example, the RPC backend doesn't need this, and our use case is very similar to the RPC backend.
