Enhancement: Dynamic Context Token Size for OpenRouter LLM #1698

Closed
1 task done
dannykorpan opened this issue Feb 1, 2024 · 4 comments · Fixed by #1703
Labels
✨ enhancement New feature or request

Comments

@dannykorpan

What is your question?

Hi,

I'm testing the codellama/codellama-70b-instruct LLM from OpenRouter. It's limited to a context length of 2,048 tokens, but it's capable of unlimited context length.

My questions are:

  1. How do I set a longer context length, e.g. 128k?
  2. How do I disable the middle-out transform?

More Details

Link to LLM on OpenRouter: https://openrouter.ai/models/codellama/codellama-70b-instruct

Thank you for your help in advance,
Danny

What is the main subject of your question?

No response

Screenshots

(screenshot attached)

Code of Conduct

  • I agree to follow this project's Code of Conduct
@dannykorpan dannykorpan added the ❓ question Further information is requested label Feb 1, 2024
@danny-avila danny-avila changed the title [Question]: How do I set max_token size for OpenRouter LLM Enhancement: Dynamic Context Token Size for OpenRouter LLM Feb 1, 2024
@danny-avila
Owner

You can't, but OpenRouter does return context token size when fetching models. I've been thinking about how to use this and will have an update soon.
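For reference, a minimal sketch of reading the per-model context size from OpenRouter's public models endpoint (field names follow their /models response; this is not LibreChat's actual implementation):

```ts
// Sketch: fetch OpenRouter's model list and look up a model's context window.
// Assumes the response shape { data: [{ id, context_length, ... }] }.
interface OpenRouterModel {
  id: string;
  context_length: number;
}

async function getContextLength(modelId: string): Promise<number | undefined> {
  const res = await fetch('https://openrouter.ai/api/v1/models');
  const { data } = (await res.json()) as { data: OpenRouterModel[] };
  return data.find((m) => m.id === modelId)?.context_length;
}

// Example: look up the context window for the model discussed in this issue.
getContextLength('codellama/codellama-70b-instruct').then((len) =>
  console.log(`context_length: ${len}`),
);
```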

@danny-avila danny-avila added ✨ enhancement New feature or request and removed ❓ question Further information is requested labels Feb 1, 2024
@danny-avila
Owner


So I couldn't find a way to "disable" middle-out in their docs, but if you get "infinite" context, that implies middle-out is activated, so my assumption is that "disabling" it means abiding by the 2,048-token context of that model.

My upcoming update will use their reported context sizes, which may not be satisfactory to you for this particular model since it only has a 2,048-token context. I think this is fine though, since they are discarding your tokens anyway (it just won't appear that way).

@dannykorpan
Author

Hi, I've found this in the documentation. Do you have to change transforms: ["middle-out"] to transforms: [] to disable it?

middle-out: compress prompts and message chains to the context size. This helps users extend conversations in part because [LLMs pay significantly less attention](https://twitter.com/xanderatallah/status/1678511019834896386) to the middle of sequences anyway. Works by compressing or removing messages in the middle of the prompt.

Note: [All OpenRouter models](https://openrouter.ai/models) default to using middle-out, unless you exclude this transform by e.g. setting transforms: [] in the request body.

https://openrouter.ai/docs#errors
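If that reading is right, a minimal sketch of what disabling it would look like (an assumption based on the quoted docs, not something LibreChat exposes today) is passing an empty transforms array in the request body:

```ts
// Sketch: an OpenRouter chat completion request with middle-out disabled via transforms: [].
// Model name and message are placeholders; assumes an OPENROUTER_API_KEY env variable.
const response = await fetch('https://openrouter.ai/api/v1/chat/completions', {
  method: 'POST',
  headers: {
    Authorization: `Bearer ${process.env.OPENROUTER_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'codellama/codellama-70b-instruct',
    messages: [{ role: 'user', content: 'Hello' }],
    transforms: [], // empty array opts out of middle-out; omit the field for the default
  }),
});
console.log(await response.json());
```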

@danny-avila
Owner

LibreChat is already doing what OpenRouter would do if you exceed the context though.

> it's capable of unlimited context length.

This is not really true. They just discard your tokens with the "middle-out" strategy if you go over. Disabling it would run you into a context length error. LibreChat discards tokens that would exceed the context, starting from the tail end of the conversation.
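As a rough illustration of that truncation idea (a simplified sketch, not LibreChat's actual code; the token estimate and message ordering are assumptions):

```ts
// Sketch: keep the most recent messages that fit the model's context window and
// discard the rest. Uses a crude ~4 characters-per-token estimate for illustration.
interface Message {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

function truncateToContext(messages: Message[], contextLength: number): Message[] {
  const kept: Message[] = [];
  let total = 0;
  // Walk from the newest message backwards, keeping as many as still fit.
  for (let i = messages.length - 1; i >= 0; i--) {
    const cost = estimateTokens(messages[i].content);
    if (total + cost > contextLength) break;
    kept.unshift(messages[i]);
    total += cost;
  }
  return kept;
}
```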
