Classifier Free Guidance #536
Thank you Martin. It is really a guidance, not a negative prompt. We could call it a positive prompt: you basically tell the model what to answer. This is not what we want, because it is not as useful as a negative prompt, where we can prevent specific answers (for example for ethical filtering). I am not sure if llama.cpp has support for a negative prompt, or if we can turn this positive prompt (guidance) around to become a negative prompt. Maybe you have some ideas.
Added a possible bug issue: ggerganov/llama.cpp#5709
Do you mean the master branch in llama.cpp? If so, that's expected; there's no stability in the llama.cpp API, so tweaks are required on the C# side every time we update the binaries we ship. The branch should work as-is though, without any modifications.
I'm not 100% sure what you mean, but did you try a negative guidance strength? That should work in the "opposite direction", if that's what you're looking for? Edit: Change the
No, I took [martindevans:guidance_in_token_data_array] and there were a lot of issues, like Span instead of ReadOnlySpan, some 'protected tokens' which I have removed, issues in the native API, etc. If you have committed that version here, then I am not sure how it passed all the checks. Maybe I have downloaded the wrong version.
What we have now is a positive guidance (what usually goes into the normal prompt) instead of a negative one. You can see this if you run your example: if you include 'red' in your negative prompt, then 'red' will be in the output. This is not how a negative prompt should work. If you add 'red' to the negative prompt, then we do not expect 'red' in the output.
…sifier free guidance: force-pushed from 2f4a85b to 879c62e
I've rebased this branch onto master now. I made some changes to the custom sampling pipeline (actually inspired by this PR) which broke things.
So rather than trying to steer the model away from a certain direction/topic, are you trying to make sure it doesn't mention that thing at all?

Results

Just demonstrating some more results, for the sake of discussion. Prompts:

Weight: 2

Unguided: blue. Blue is calm, serene and peaceful. It's the colour of the sky and the ocean, two of my favourite things.

The guidance ("hate red") has been negated to steer the model away from that direction (i.e. towards liking red).

Weight: -2

Unguided: brown. It's such a versatile colour that can be used in many different ways. It's warm, inviting and comforting, which makes

Here the guidance weight has been inverted. It now hates red, just like the instructions in the guidance tell it to. Not sure how useful this is, since you could just put that instruction in the normal prompt.
This sounds very promising! I think that you have just inverted the weight as we need it.
I think that you should change your example like this:
What we still need to figure out is how to make sure that some things are just not mentioned. For example, if I do not ever want to see 'red' in the output, how do we do it? In your example 'red' is mentioned in the output even with weight -2.
Banning a specific token can be achieved with a logit bias, for example setting a bias of
I do not see any logit bias inference parameter. Could you please give an example of how to do this? Also, what is 1171? I have tested the code. A few remarks:
It's here. If you wanted to ban the token "red" you would add

This is a much more limited mechanism than CFG; as you say, it can't ban words that aren't individual tokens. It could also cause it to become predisposed to reductively redacting redundant words ;)
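For illustration, here is a minimal sketch of the mechanism in Python, not the LLamaSharp API; the token id 1171 is just a stand-in for whatever id "red" tokenises to in a given model's vocabulary. A bias of negative infinity makes the token impossible to sample:

```python
import math

def apply_logit_bias(logits, bias):
    """Add a per-token bias to the raw logits before sampling.

    A bias of -inf drives the token's probability to zero, banning it outright.
    """
    out = dict(logits)
    for token_id, b in bias.items():
        out[token_id] = out.get(token_id, 0.0) + b
    return out

# Toy logits over a tiny vocabulary; 1171 stands in for "red".
logits = {1171: 3.2, 42: 2.9, 7: 0.1}

# Ban token 1171 outright.
banned = apply_logit_bias(logits, {1171: -math.inf})

# Greedy sampling now picks the next-best token instead.
best = max(banned, key=banned.get)
```

Note this only works per-token, which is exactly the limitation mentioned above: a word that tokenises into several pieces can't be cleanly banned this way.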
Odd, I don't see this. I'm running with CPU inference though, are you running with CUDA? I just checked and I had missed a few places where resources were not being disposed properly in the examples, I just pushed up a commit fixing that. Hopefully that helps 🤞
If you look in the PR which originally implemented CFG ggerganov/llama.cpp#2135 you can see their examples do the same thing. This makes sense given how it works internally. All it's doing is generating 2 sets of token probabilities at once, but the guidance probabilities lower the chance a token is selected. This means you want both sequences talking about roughly the same thing, otherwise the guidance probabilities are unrelated and don't really do much (it'll probably just make already unlikely tokens less likely).
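As a sketch of what that fusion looks like (in Python, following my reading of the llama.cpp implementation rather than the actual C# sampling code here): both sets of logits are log-softmaxed, then the guided logits are extrapolated away from the guidance logits by the weight.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def cfg_combine(guided_logits, guidance_logits, scale):
    """Fuse the two distributions: extrapolate guided away from guidance by `scale`."""
    g = log_softmax(guided_logits)
    h = log_softmax(guidance_logits)
    return [h_i + scale * (g_i - h_i) for g_i, h_i in zip(g, h)]

# Identical sequences -> identical logits -> the guidance has no effect at all.
same = cfg_combine([1.0, 2.0, 3.0], [1.0, 2.0, 3.0], 2.0)

# Token 0 is much likelier under the guidance sequence, so with scale > 1
# the combined logits push it down relative to token 1.
shifted = cfg_combine([0.0, 0.0], [2.0, 0.0], 2.0)
```

This also shows why both sequences need to be "about the same thing": only the tokens where the two distributions differ are pushed around, everything else is left alone.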
I'm not sure exactly what you mean. Do you mean this bit?

```csharp
// Use this token to advance both guided _and_ guidance, keeping them in sync (except for the initial prompt).
guided.Prompt(g);
guidance.Prompt(g);
```

If so, again this is due to the way CFG works. Both the
Thank you Martin. It is clear now what is happening. We basically need to enter the things we do not want to see in the output before the original prompt in the negative guidance (negative prompt). The fusion of the logits will then make sure that those elements have less chance to appear in the output. Yes, I agree, this logit bias removal of tokens does not make sense.
Tested the code again. No crash this time, you have found the problem 👍.
That shouldn't be a huge problem. The guidance is based on the difference in probabilities between guided and guidance. For example, if you prompted the same thing in both sequences it would make no difference at all to the output, because the probabilities would be identical in both sequences! In the example we're using here with favourite colours, both sequences will generate tokens (with high probability) to do with talking about colours in general (small difference, not much effect from guidance). However, the negative one will have a much higher probability for tokens talking about hating red, so that will have a strong effect on the output (pushing it away from that). This is why the negative sequence needs to contain the same prompt: it keeps the continuations as similar as possible, and thus the differences in token weights are relevant.
I think that it would be a good idea to enforce that the original prompt is added to the end of the guidance prompt (negative prompt). We should ask for the guidance prompt and then join them together internally. It makes no sense to accidentally have a different original prompt in the guidance prompt...
Yeah, I agree. This is the "low level" interface (working directly with sampling, tokens, logits etc). There definitely needs to be a much higher level API over this in the future, where you can just supply a positive and a negative prompt and it internally does the right thing (e.g. joining them together, making sure the combo fits the template etc).
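A sketch of what such a higher-level API could do internally (hypothetical names, in Python for brevity): the caller supplies only a prompt and a negative prompt, and the two sequences are built so that the shared prompt is guaranteed to be identical in both.

```python
def build_cfg_prompts(prompt: str, negative: str) -> tuple[str, str]:
    """Build the guided and guidance prompts for CFG.

    The negative text goes *before* a copy of the original prompt, so both
    sequences end with the same prompt and their continuations stay comparable.
    """
    guided = prompt
    guidance = negative.rstrip() + "\n" + prompt
    return guided, guidance

guided, guidance = build_cfg_prompts(
    "What is your favourite colour?",
    "I hate the colour red.",
)
```

This removes the footgun discussed above: the user can no longer accidentally supply a guidance sequence whose trailing prompt differs from the guided one.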
Since the crashing issue has been fixed for you I'll merge this one. Thanks for your help testing it :)
Implemented a demo of classifier free guidance (aka "negative prompts") using the batched executor.
See #535.