The initialization of llama_batch::seq_id in simple.cpp seems suspect, but I'm not knowledgeable enough about what seq_id should be to fix it:
llama_batch batch = llama_batch_init(512, 0, 1);

// evaluate the initial prompt
batch.n_tokens = tokens_list.size();

for (int32_t i = 0; i < batch.n_tokens; i++) {
    batch.token[i]  = tokens_list[i];
    batch.pos[i]    = i;
    batch.seq_id[i] = 0;
    batch.logits[i] = false;
}

// llama_decode will output logits only for the last token of the prompt
batch.logits[batch.n_tokens - 1] = true;
Time permitting, I may take a stab at porting over whatever seems to be working for main.
Expected Behavior

Running ./simple with TheBloke's Llama-2-7b-Chat-GGUF (https://huggingface.co/TheBloke/Llama-2-7b-Chat-GGUF) should complete without issue.

Current Behavior

./simple ~/.cache/huggingface/hub/models--TheBloke--Llama-2-7b-Chat-GGUF/blobs/08a5566d61d7cb6b420c3e4387a39e0078e1f2fe5f055f3a03887385304d4bfa

results in

Hello my name is
Segmentation fault (core dumped)

The model works fine with main. I'm running Ubuntu latest with everything up to date, compiled with make (no CUDA, etc.). The line that fails is in the snippet quoted above.