Implement non-greedy tokenizer that tries to maximize token lengths #242
Conversation
Although I haven't examined the code, I've tested it on several prompts and can already conclude that this patch allows Llama to write in French.
Force-pushed from e40d4e0 to 3838e51
@@ -846,6 +846,7 @@ int main(int argc, char ** argv) {
     std::vector<float> logits;

     // tokenize the prompt
+    params.prompt.insert(0, 1, ' ');
Is the space meant to be a separate token? I noticed that it often gets fused with the first user-provided token.
It should be fused to the first token! This is how the original Python llama code parses it.
I can dig out more details if you want.
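The behavior discussed above can be illustrated with a minimal sketch. `prepare_prompt` is a hypothetical helper (not part of the PR) wrapping the same `std::string::insert(pos, count, ch)` call the diff adds, so the leading space ends up fused with the first word during tokenization:

```cpp
#include <string>

// Hypothetical helper illustrating the diff: prepend exactly one space
// so the tokenizer can fuse it with the first word, matching the
// original Python llama tokenizer's behavior.
std::string prepare_prompt(std::string prompt) {
    prompt.insert(0, 1, ' ');  // std::string::insert(pos, count, ch)
    return prompt;
}
```

With this, a prompt like `"Hello world"` becomes `" Hello world"` before tokenization, so the first emitted token is the space-prefixed word rather than a bare leading-space token.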
Merge it if results look ok.
I won't be able to have a detailed look in the next few days.
Insert single space in front of the prompt - this is to match original llama tokenizer behavior
Force-pushed from 3838e51 to 7566d1a
…gerganov#242)

* Implement non-greedy tokenizer that tries to maximize token lengths
* Insert single space in front of the prompt - this is to match original llama tokenizer behavior

---------

Co-authored-by: Jakub Horak <[email protected]>
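The tokenizer change named in the commit message above can be sketched with a toy dynamic-programming segmenter. This is not the PR's actual code (the real implementation works over the llama vocabulary and emits token ids): it minimizes the number of tokens covering the input, which tends to maximize token lengths, unlike a greedy left-to-right matcher. It assumes every single character appears in the vocabulary so a segmentation always exists:

```cpp
#include <set>
#include <string>
#include <vector>

// Toy sketch: split `text` into the fewest tokens drawn from `vocab`
// using dynamic programming over prefix lengths.
std::vector<std::string> tokenize(const std::string & text,
                                  const std::set<std::string> & vocab) {
    const size_t n = text.size();
    const size_t INF = n + 1; // more tokens than any valid segmentation

    std::vector<size_t> best(n + 1, INF); // best[i]: min tokens for text[0..i)
    std::vector<size_t> prev(n + 1, 0);   // prev[i]: start of the last token
    best[0] = 0;

    for (size_t i = 1; i <= n; ++i) {
        for (size_t j = 0; j < i; ++j) {
            if (best[j] + 1 < best[i] && vocab.count(text.substr(j, i - j))) {
                best[i] = best[j] + 1;
                prev[i] = j;
            }
        }
    }

    // walk back through prev[] and reverse to recover the segmentation
    std::vector<std::string> out;
    for (size_t i = n; i > 0; i = prev[i]) {
        out.push_back(text.substr(prev[i], i - prev[i]));
    }
    return {out.rbegin(), out.rend()};
}
```

For example, with the toy vocabulary `{"a", "b", "c", "ab", "abc"}`, the input `"abcab"` segments into `{"abc", "ab"}` (two tokens) rather than five single-character tokens.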
No description provided.