[Bug] gorilla-openfunctions-v1-q4f16_1-MLC crashes on JIT lib build on cuda12.2 #2113
Comments
Thank you @Sing-Li for reporting! I just updated the config.
Thank you @MasterJH5574. It works fine now. Closing the issue.
Sorry @MasterJH5574, is it possible to update the configs for the other two gorilla function-calling weights as well? 🙏 https://huggingface.co/mlc-ai/gorilla-openfunctions-v2-q4f32_1-MLC https://huggingface.co/mlc-ai/gorilla-openfunctions-v2-q4f16_1-MLC
Hey @Sing-Li, sorry for the late reply. I just updated these two repositories. If I remember correctly, there might still be an output formatting issue with function calling for gorilla v2. Could you try it at your convenience and see how it goes?
Thanks @MasterJH5574. Test results: gorilla-openfunctions-v2-q4f16_1 is running.
Thank you @Sing-Li for checking again. This issue #2121 (comment) also reports a similar error. We will look into that.
🐛 Bug
Trying to serve gorilla openfunctions v1 crashes during the initial JIT library build. The same happens with openfunctions v2, in both f16 and f32.
To Reproduce
Steps to reproduce the behavior:
mlc_llm serve HF://mlc-ai/gorilla-openfunctions-v1-q4f16_1-MLC
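For context, once `mlc_llm serve` starts successfully it exposes an OpenAI-compatible REST API, so the function-calling behavior discussed above can be exercised with a standard chat-completions request. The sketch below only builds such a request payload (no server required); the tool schema and user message are illustrative, and the endpoint default of http://127.0.0.1:8000/v1/chat/completions is an assumption from MLC-LLM's serve docs rather than something stated in this issue.

```python
import json

# Hypothetical function-calling request against a running `mlc_llm serve`
# instance. The model id matches the repro command above; the tool
# definition (get_current_weather) is purely illustrative.
payload = {
    "model": "HF://mlc-ai/gorilla-openfunctions-v1-q4f16_1-MLC",
    "messages": [
        {"role": "user", "content": "What's the weather in Boston?"}
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_current_weather",
                "description": "Get the current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
}

# Serialized body that would be POSTed to the (assumed) endpoint
# http://127.0.0.1:8000/v1/chat/completions
body = json.dumps(payload)
print(body)
```

If the JIT library build crashes as reported, the server never reaches the point of accepting this request, which is why the bug blocks any function-calling testing.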
Expected behavior
It should work as it does with Llama, Mistral, and Gemma.
Environment
- How you installed MLC-LLM (conda, source): nightly pip
- How you installed TVM-Unity (pip, source): nightly prebuilt
- TVM Unity Hash Tag (`python -c "import tvm; print('\n'.join(f'{k}: {v}' for k, v in tvm.support.libinfo().items()))"`, applicable if you compile models):

Additional context