Enhancement: Dynamic Context Token Size for OpenRouter LLM #1698

Closed
1 task done
dannykorpan opened this issue Feb 1, 2024 · 4 comments · Fixed by #1703
Labels
✨ enhancement New feature or request

Comments

@dannykorpan

What is your question?

Hi,

I'm testing the codellama/codellama-70b-instruct LLM from OpenRouter. It's limited to a context length of 2,048 tokens, but it's capable of unlimited context length.

My questions are:

  1. How do I set a longer context length, e.g. 128k?
  2. How do I disable the middle-out transform?

More Details

Link to LLM on OpenRouter: https://openrouter.ai/models/codellama/codellama-70b-instruct

Thank you for your help in advance,
Danny

What is the main subject of your question?

No response

Screenshots

(screenshot attached)

Code of Conduct

  • I agree to follow this project's Code of Conduct
@dannykorpan dannykorpan added the ❓ question Further information is requested label Feb 1, 2024
@danny-avila danny-avila changed the title [Question]: How do I set max_token size for OpenRouter LLM Enhancement: Dynamic Context Token Size for OpenRouter LLM Feb 1, 2024
@danny-avila
Owner

You can't, but OpenRouter does return context token size when fetching models. I've been thinking about how to use this and will have an update soon.
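For reference, a minimal sketch of reading the per-model context size from OpenRouter's public models endpoint (field names follow their /models response; this is not LibreChat's actual implementation):

```ts
// Sketch: fetch OpenRouter's model list and look up a model's context window.
// Assumes the response shape { data: [{ id, context_length, ... }] }.
interface OpenRouterModel {
  id: string;
  context_length: number;
}

async function getContextLength(modelId: string): Promise<number | undefined> {
  const res = await fetch('https://openrouter.ai/api/v1/models');
  const { data } = (await res.json()) as { data: OpenRouterModel[] };
  return data.find((m) => m.id === modelId)?.context_length;
}

// Example: look up the context window for the model discussed in this issue.
getContextLength('codellama/codellama-70b-instruct').then((len) =>
  console.log(`context_length: ${len}`),
);
```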

@danny-avila danny-avila added ✨ enhancement New feature or request and removed ❓ question Further information is requested labels Feb 1, 2024
@danny-avila
Owner


So I couldn't find a way to "disable" middle-out in their docs, but if you get "infinite" context, that implies middle-out is activated, so my assumption is that "disabling" it means abiding by the 2,048-token context of that model.

My upcoming update will use their reported context sizes, which may not be satisfactory to you for this particular model since it only has a 2,048-token context. I think this is fine though, since they are discarding your tokens anyway (it just won't appear that way).

@dannykorpan
Author

Hi, I've found this in the documentation. Do you have to change transforms: ["middle-out"] to transforms: [] to disable it?

middle-out: compress prompts and message chains to the context size. This helps users extend conversations in part because [LLMs pay significantly less attention](https://twitter.com/xanderatallah/status/1678511019834896386) to the middle of sequences anyway. Works by compressing or removing messages in the middle of the prompt.

Note: [All OpenRouter models](https://openrouter.ai/models) default to using middle-out, unless you exclude this transform by e.g. setting transforms: [] in the request body.

https://openrouter.ai/docs#errors
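If that reading is right, a minimal sketch of what disabling it would look like (an assumption based on the quoted docs, not something LibreChat exposes today) is passing an empty transforms array in the request body:

```ts
// Sketch: an OpenRouter chat completion request with middle-out disabled via transforms: [].
// Model name and message are placeholders; assumes an OPENROUTER_API_KEY env variable.
const response = await fetch('https://openrouter.ai/api/v1/chat/completions', {
  method: 'POST',
  headers: {
    Authorization: `Bearer ${process.env.OPENROUTER_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'codellama/codellama-70b-instruct',
    messages: [{ role: 'user', content: 'Hello' }],
    transforms: [], // empty array opts out of middle-out; omit the field for the default
  }),
});
console.log(await response.json());
```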

@danny-avila
Owner

LibreChat is already doing what OpenRouter would do if you exceed the context though.

> it's capable of unlimited context length.

This is not really true. They just discard your tokens with the "middle-out" strategy if you go over. Disabling it would run you into a context length error. LibreChat discards tokens that would exceed the context, starting from the tail end of the conversation.
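As a rough illustration of that truncation idea (a simplified sketch, not LibreChat's actual code; the token estimate and message ordering are assumptions):

```ts
// Sketch: keep the most recent messages that fit the model's context window and
// discard the rest. Uses a crude ~4 characters-per-token estimate for illustration.
interface Message {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

function truncateToContext(messages: Message[], contextLength: number): Message[] {
  const kept: Message[] = [];
  let total = 0;
  // Walk from the newest message backwards, keeping as many as still fit.
  for (let i = messages.length - 1; i >= 0; i--) {
    const cost = estimateTokens(messages[i].content);
    if (total + cost > contextLength) break;
    kept.unshift(messages[i]);
    total += cost;
  }
  return kept;
}
```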
