
Implement context shifting in executor base #714

Merged
merged 3 commits into SciSharp:master on May 4, 2024

Conversation

ksanman
Contributor

@ksanman ksanman commented May 2, 2024

Based on the discussion in #713

This will allow the Interactive/Instruct executors to avoid NoKvSlot errors when the chat history/prompt is longer than the model context.

This is the same logic as the stateless executor and the llama.cpp main example.

@martindevans
Member

There are some interesting differences between the llama.cpp example code you linked and the implementation in your code (and the stateless executor).

Stateless:

llama_kv_cache_seq_rm(ctx,  0, inferenceParams.TokensKeep + 1, inferenceParams.TokensKeep + n_discard + 1);
llama_kv_cache_seq_add(ctx, 0, inferenceParams.TokensKeep + 1 + n_discard, n_past, -n_discard);

llama.cpp example:

llama_kv_cache_seq_rm (ctx, 0, params.n_keep            , params.n_keep + n_discard);
llama_kv_cache_seq_add(ctx, 0, params.n_keep + n_discard, n_past, -n_discard);

Some weird +1 bits in there.

Digging through the llama.cpp history, it looks like the example code used to look like that until this PR.

Would you be up for modifying it to work like the new example code, testing that, and if it works also porting that change back across to the stateless executor?

@ksanman
Contributor Author

ksanman commented May 3, 2024

I have added a method that adds +1 to tokensKeep if the BOS token is available, and refactored the executors to be in line with the example code.

The models I have are working fine, but I need to test some other models that do not have BOS (like Falcon) to see if they work.

@martindevans
Member

Looks good so far! Thanks for doing the extra work.

@SignalRT
Collaborator

SignalRT commented May 4, 2024

It seems ok to me.

@ksanman
Contributor Author

ksanman commented May 4, 2024

I tested all the supported models, and they are working correctly.

@martindevans
Member

Excellent, thanks for the hard work @ksanman!

@martindevans martindevans merged commit 9906871 into SciSharp:master May 4, 2024
4 checks passed