Description
I'm building a LLaVA application. When the number of tokens in my initial prompt exceeds the batch size, the InteractiveExecutor throws:
System.ArgumentException: Offset and length were out of bounds for the array or count is greater than the number of elements from index to the end of the source collection.
at System.Collections.Generic.List`1.GetRange(Int32 index, Int32 count)
at LLama.InteractiveExecutor.InferInternal(IInferenceParams inferenceParams, InferStateArgs args) in C:\RiderProjects\llava_defender\LLama\LLamaInteractExecutor.cs:line 257
at LLama.StatefulExecutorBase.InferAsync(String text, IInferenceParams inferenceParams, CancellationToken cancellationToken)+MoveNext() in C:\RiderProjects\llava_defender\LLama\LLamaExecutorBase.cs:line 325
When a breakpoint is added at LLamaInteractExecutor.cs line 257, we can observe the following:
My initial prompt is 1067 tokens (I tokenized it and counted), and the image embed is at position 1055 (near the end of the prompt), but _embeds only goes up to 512 (the batch size).
Reproduction Steps
Use the default batch size (512)
Use a prompt of 513 or more tokens (see the sketch after this list)
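A minimal sketch that reproduces the failure. The model path and the repeated-word prompt are illustrative assumptions, not taken from the original report:

```csharp
using System;
using System.Linq;
using LLama;
using LLama.Common;

var parameters = new ModelParams("model.gguf") // hypothetical model path
{
    ContextSize = 2048,
    BatchSize = 512 // the default batch size
};

using var model = LLamaWeights.LoadFromFile(parameters);
using var context = model.CreateContext(parameters);
var executor = new InteractiveExecutor(context);

// Any prompt that tokenizes to more than 512 tokens (the batch size)
// should trigger the ArgumentException in InferInternal.
var longPrompt = string.Join(" ", Enumerable.Repeat("token", 600));

await foreach (var text in executor.InferAsync(longPrompt, new InferenceParams()))
    Console.Write(text);
```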
Environment & Configuration
Operating system: Windows 10
.NET runtime version: 8.0
LLamaSharp version: current master
CUDA version (if you are using cuda backend): 12
CPU & GPU device: 7700K + RTX 3080
Known Workarounds
Increase the batch size to at least (length of initial prompt + 1), as in the sketch below
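For example (a sketch; the exact value depends on your prompt):

```csharp
// BatchSize must exceed the tokenized prompt length
// (1067 tokens in this case), so round up generously.
var parameters = new ModelParams("model.gguf") // hypothetical model path
{
    ContextSize = 2048,
    BatchSize = 1100 // > 1067 + 1
};
```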
Since #761 the BatchedExecutor automatically splits work up into multiple batches (so a prompt of any size can be handled; you just need to call Infer() enough times to process the entire queue of work), and since #770 the BatchedExecutor has had LLava support.
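A rough sketch of that pattern, with the caveat that using BatchedTokenCount as the drain condition is my assumption about how to detect remaining queued work:

```csharp
using System;
using System.Linq;
using LLama;
using LLama.Batched;
using LLama.Common;

var parameters = new ModelParams("model.gguf"); // hypothetical model path
using var model = LLamaWeights.LoadFromFile(parameters);
using var executor = new BatchedExecutor(model, parameters);

using var conversation = executor.Create();

// A prompt longer than BatchSize is fine: the executor queues it internally.
var longPrompt = string.Join(" ", Enumerable.Repeat("token", 600));
conversation.Prompt(executor.Context.Tokenize(longPrompt));

// The prompt is processed across multiple batches, so keep calling
// Infer() until no queued tokens remain.
while (executor.BatchedTokenCount > 0)
    await executor.Infer();
```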