fix perplexity after c-api refactor (#390)
* preallocate a buffer of fitting size for tokenization (utils.cpp)

* don't create a new std::string (especially here, where it's usually large)
Green-Sky authored Mar 22, 2023
1 parent 40ea807 commit 56e659a
Showing 2 changed files with 4 additions and 2 deletions.
2 changes: 1 addition & 1 deletion main.cpp
@@ -85,7 +85,7 @@ void perplexity(llama_context * ctx, const gpt_params & params) {
     // Download: https://s3.amazonaws.com/research.metamind.io/wikitext/wikitext-2-raw-v1.zip?ref=salesforce-research
     // Run `./main --perplexity -m models/7B/ggml-model-q4_0.bin -f wiki.test.raw`
     // Output: `perplexity: 13.5106 [114/114]`
-    auto tokens = ::llama_tokenize(ctx, params.prompt.c_str(), true);
+    auto tokens = ::llama_tokenize(ctx, params.prompt, true);

     int count = 0;
     double nll = 0.0;
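
The main.cpp change matters because the wrapper takes `const std::string &`: passing `params.prompt.c_str()` makes the compiler build a temporary `std::string` from the raw pointer, which copies the whole prompt. The sketch below shows that effect in isolation. `sees_original_buffer` and `check_copy_behaviour` are hypothetical helpers written for this note and are not part of llama.cpp.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: true if the callee is looking at the caller's own
// character buffer, false if it received a copy.
bool sees_original_buffer(const std::string & text, const char * original) {
    return text.data() == original;
}

// Hypothetical demo comparing the two call styles from the diff.
void check_copy_behaviour() {
    const std::string prompt(1 << 20, 'x');  // large input, like a perplexity test file

    // Passing the string binds the reference directly: no copy is made.
    assert(sees_original_buffer(prompt, prompt.data()));

    // Passing c_str() constructs a temporary std::string with its own storage.
    assert(!sees_original_buffer(prompt.c_str(), prompt.data()));
}
```

For a prompt the size of a perplexity test file, the second form costs an extra length scan and a full copy on every call, which is what the commit message's second bullet avoids.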
4 changes: 3 additions & 1 deletion utils.cpp
@@ -146,8 +146,10 @@ std::string gpt_random_prompt(std::mt19937 & rng) {

 // TODO: not great allocating this every time
 std::vector<llama_token> llama_tokenize(struct llama_context * ctx, const std::string & text, bool add_bos) {
-    std::vector<llama_token> res(8096);
+    // initialize to prompt number of chars, since n_tokens <= n_prompt_chars
+    std::vector<llama_token> res(text.size() + (int)add_bos);
     int n = llama_tokenize(ctx, text.c_str(), res.data(), res.size(), add_bos);
+    assert(n >= 0);
     res.resize(n);

     return res;
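
The sizing rule in the utils.cpp hunk can be tried without a model. Below is a minimal sketch that assumes a stand-in tokenizer: `toy_tokenize`, `tokenize`, and the `token` alias are hypothetical names for this note, and the one-token-per-word scheme is not how llama.cpp tokenizes. The only shared property is that every token covers at least one input character, so `text.size()` plus one slot for BOS is always enough room.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

using token = int;  // hypothetical stand-in for llama_token

// Hypothetical C-style tokenizer (NOT the llama.cpp API): emits one token per
// whitespace-separated word, preceded by a BOS token (id 1) when add_bos is set.
// Returns the number of tokens written, or -1 if the buffer is too small.
int toy_tokenize(const char * text, token * out, int n_max, bool add_bos) {
    int n = 0;
    if (add_bos) {
        if (n >= n_max) return -1;
        out[n++] = 1;
    }
    bool in_word = false;
    for (const char * p = text; *p != '\0'; ++p) {
        const bool space = std::isspace(static_cast<unsigned char>(*p)) != 0;
        if (!space && !in_word) {
            if (n >= n_max) return -1;
            out[n++] = 100 + static_cast<unsigned char>(*p);  // arbitrary id from first letter
        }
        in_word = !space;
    }
    return n;
}

// Wrapper in the style of the patched function: size the buffer from the input.
std::vector<token> tokenize(const std::string & text, bool add_bos) {
    // Each token consumes at least one character, so this is an upper bound.
    std::vector<token> res(text.size() + (add_bos ? 1 : 0));
    const int n = toy_tokenize(text.c_str(), res.data(), static_cast<int>(res.size()), add_bos);
    assert(n >= 0);  // cannot fail if the bound above holds
    res.resize(n);   // shrink to the tokens actually produced
    return res;
}
```

With this sizing the buffer grows with the prompt instead of stopping at the fixed 8096 slots used by the removed line. The trade-off is that it over-allocates whenever tokens span several characters, and the allocation still happens on every call, as the TODO in the diff notes.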