Updated make-ggml.py compatibility with more models and GGUF #3290

Merged: 5 commits merged on Sep 27, 2023

Conversation

richardr1126

I have updated the script to handle HF --> GGUF conversions for all of the supported model types.

There is a new flag, --model_type, which takes one of llama, starcoder, falcon, baichuan, or gptneox. I have not tested every model type, but I can confirm that llama and starcoder quantize correctly with k-quants, while the falcon model only works with the legacy quants.
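
For anyone who wants a picture of how the flag behaves, here is a minimal sketch, not the code from make-ggml.py itself. It assumes an argparse-based command line; the `--quants` option, the `llama` default, the helper names `build_parser` and `unsupported_quants`, and the grouping of quant names into legacy and k-quant sets are all illustrative assumptions. The falcon restriction encodes only what was observed above.

```python
import argparse

# The model types accepted by --model_type, as listed in this PR.
MODEL_TYPES = ["llama", "starcoder", "falcon", "baichuan", "gptneox"]

# Assumed grouping of quantization names, for illustration only.
LEGACY_QUANTS = {"q4_0", "q4_1", "q5_0", "q5_1", "q8_0"}
K_QUANTS = {
    "q2_k", "q3_k_s", "q3_k_m", "q3_k_l",
    "q4_k_s", "q4_k_m", "q5_k_s", "q5_k_m", "q6_k",
}


def build_parser():
    """Build a hypothetical command line that mirrors the new flag."""
    parser = argparse.ArgumentParser(description="Convert an HF model to GGUF")
    parser.add_argument("model", help="path to the HF model directory")
    # argparse rejects any value that is not in `choices`.
    parser.add_argument("--model_type", choices=MODEL_TYPES, default="llama")
    # Hypothetical option: which quantizations to produce.
    parser.add_argument("--quants", nargs="*", default=["q4_k_m"])
    return parser


def unsupported_quants(model_type, quants):
    """Return the requested quants that are not expected to work.

    Only the falcon restriction from this PR is encoded: falcon was
    observed to work with the legacy quants only.
    """
    if model_type == "falcon":
        return [q for q in quants if q.lower() not in LEGACY_QUANTS]
    return []
```

With this sketch, parsing `some/model --model_type falcon --quants q4_0 q4_k_m` and passing the result to `unsupported_quants` flags `q4_k_m`, so a wrapper could warn before starting a long conversion.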

@richardr1126 changed the title from "Updated make-ggml.py with compatibility with more models and GGUF" to "Updated make-ggml.py compatibility with more models and GGUF" on Sep 21, 2023
@ggerganov merged commit ac43576 into ggerganov:master on Sep 27, 2023
joelkuiper added a commit to vortext/llama.cpp that referenced this pull request Sep 27, 2023
…example

* 'master' of github.com:ggerganov/llama.cpp:
  convert : remove bug in convert.py permute function (ggerganov#3364)
  make-ggml.py : compatibility with more models and GGUF (ggerganov#3290)
  gguf : fix a few general keys (ggerganov#3341)
  metal : reusing llama.cpp logging (ggerganov#3152)
  build : add ACCELERATE_NEW_LAPACK to fix warning on macOS Sonoma (ggerganov#3342)
  readme : add some recent perplexity and bpw measurements to READMES, link for k-quants (ggerganov#3340)
  cmake : fix build-info.h on MSVC (ggerganov#3309)
  docs: Fix typo CLBlast_DIR var. (ggerganov#3330)
  nix : add cuda, use a symlinked toolkit for cmake (ggerganov#3202)