feat(option): to use different samplers #20
Seems to be working as before. The new sampling parameter defaults in llama.cpp are:

```cpp
tfs_z             = 1.0f;
typical_p         = 1.0f;
frequency_penalty = 0.0f;
presence_penalty  = 0.0f;
mirostat          = 0;    // 0 = disabled, 1 = Mirostat v1, 2 = Mirostat v2
mirostat_tau      = 5.0f;
mirostat_eta      = 0.1f;
```

Might as well add those in and try.
These new ones are disabled by default. Issue #20
Hrm... for mirostat, it looks like we need to remember a …
This will make it easier to maintain other state variables. Issue #20
ggml-org/llama.cpp#1126 introduced some new ones. Right now, we use repetition penalty. It does a decent job of avoiding repeated content for a while, but it's certainly not perfect. For example, a large window penalizes a lot of punctuation and causes run-on sentences. We can already change this by excluding tokens from the penalized list, but it's a balancing act that I'm not very good at.
First order of business is to get repeat_penalty working with the new API.