Remove "first token must be BOS" restriction #2153

Merged (1 commit) into ggml-org:master on Jul 9, 2023

Conversation

oobabooga (Contributor) commented:

Currently, the llama_eval_internal function requires the first token in the tokens array to be the BOS token (id 1).
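
For context, the check looks roughly like this (a sketch of the early-exit in llama_eval_internal, based on the llama.cpp code of the time; exact wording may differ). This PR simply deletes it:

```cpp
// Sketch of the guard removed by this PR: on a fresh context (n_past == 0),
// llama_eval_internal refused to run unless the batch started with BOS.
if (n_past == 0 && tokens[0] != llama_token_bos()) {
    fprintf(stderr, "%s: first token must be BOS\n", __func__);
    return false;
}
```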

I believe this restriction is not necessary, for two reasons:

  1. Intentionally removing the BOS token can make generations more creative. With the BOS token, the prompt is associated with text at the beginning of a new document in the training dataset; without it, the prompt can be associated with text at any location. In other words, the BOS token adds a "beginning of document" bias that can optionally be removed (see the sketch after this list).

  2. The restriction is undesirable when evaluating model perplexity, since in most cases the sequence of ids will come from the middle of a document rather than the start.
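
To make point 1 concrete, here is a minimal sketch of tokenizing a prompt with and without a leading BOS, assuming the llama_tokenize signature from the C API of that era (its final bool add_bos parameter); the helper name and buffer sizing are illustrative, not from this PR:

```cpp
#include "llama.h"

#include <cstring>
#include <vector>

// Illustrative helper (not part of this PR): tokenize `text`, optionally
// prepending BOS via llama_tokenize's add_bos flag.
static std::vector<llama_token> tokenize(llama_context * ctx, const char * text, bool add_bos) {
    // rough upper bound: one token per byte, plus one slot for BOS
    std::vector<llama_token> tokens(strlen(text) + 1);
    const int n = llama_tokenize(ctx, text, tokens.data(), (int) tokens.size(), add_bos);
    tokens.resize(n < 0 ? 0 : n);
    return tokens;
}

// With this PR, both variants can be passed to eval on a fresh context:
//   tokenize(ctx, prompt, true)   // tokens[0] == llama_token_bos()
//   tokenize(ctx, prompt, false)  // raw continuation, no "new document" bias
```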

I originally encountered the "first token must be BOS" error while trying to evaluate llama.cpp using a transformers wrapper that I am working on here. The evaluation fails because the first token in the sequence provided by my code, which is based on this tutorial, is not BOS.

ggerganov (Member) commented:

I added this check because the initial batch of OpenLLaMA models suffered significantly if the first token was not BOS. I haven't checked what happened later - I have a vague memory that the issue was resolved in later releases, so the check is probably not needed any more.

@ggerganov merged commit 1d16309 into ggml-org:master on Jul 9, 2023