Problem

OpenAI o1 models require max_completion_tokens instead of max_tokens.

When the llm command is used with an OpenAI o1 series model such as o1-preview, the -o max_tokens N option triggers an error from OpenAI saying that max_completion_tokens should be used instead.

When -o max_completion_tokens N is used, llm raises its own error instead of passing the option through to the OpenAI API.
Model Provider Documentation

The OpenAI docs explain that max_tokens is deprecated and is already not compatible with the o1 models: https://platform.openai.com/docs/api-reference/chat/create

max_tokens integer or null
Optional
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
This value is now deprecated in favor of max_completion_tokens, and is not compatible with o1 series models.

max_completion_tokens integer or null
Optional
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
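For illustration, a direct Chat Completions call using the newer parameter might look like this (a sketch with the openai Python SDK; it assumes a recent openai release that supports max_completion_tokens and an API key with o1-preview access):

```python
# Sketch only: the documented replacement parameter, sent via the openai Python SDK.
# Assumes openai>=1.x (recent enough to accept max_completion_tokens) and that
# OPENAI_API_KEY grants access to o1-preview.
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="o1-preview",
    messages=[{"role": "user", "content": "Say hello"}],
    max_completion_tokens=256,  # caps visible output tokens plus reasoning tokens
)
print(response.choices[0].message.content)
```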
Examples
These commands demonstrate the problem:
$ LLM_OPENAI_SHOW_RESPONSES=1 llm --no-stream -m o1-preview -o max_completion_tokens 32000 "The response to this is a creative novel"
Error: max_completion_tokens
Extra inputs are not permitted
$ LLM_OPENAI_SHOW_RESPONSES=1 llm --no-stream -m o1-preview -o max_tokens 32000 "The response to this is a creative novel"
Request: POST https://api.openai.com/v1/chat/completions
Headers:
host: api.openai.com
connection: keep-alive
accept: application/json
content-type: application/json
user-agent: OpenAI/Python 1.60.1
x-stainless-lang: python
x-stainless-package-version: 1.60.1
x-stainless-os: Linux
x-stainless-arch: x64
x-stainless-runtime: CPython
x-stainless-runtime-version: 3.10.12
authorization: [...]
x-stainless-async: false
x-stainless-retry-count: 0
content-length: 138
Body:
{
"messages": [
{
"role": "user",
"content": "The response to this is a creative novel"
}
],
"model": "o1-preview",
"max_tokens": 32000,
"stream": false
}
Response: status_code=400
Headers:
date: Tue, 28 Jan 2025 05:46:32 GMT
content-type: application/json
content-length: 245
connection: keep-alive
access-control-expose-headers: X-Request-ID
openai-organization: [...]
openai-processing-ms: 20
openai-version: 2020-10-01
x-ratelimit-limit-requests: 10000
x-ratelimit-limit-tokens: 30000000
x-ratelimit-remaining-requests: 9999
x-ratelimit-remaining-tokens: 29995904
x-ratelimit-reset-requests: 6ms
x-ratelimit-reset-tokens: 8ms
x-request-id: req_[...]
strict-transport-security: max-age=31536000; includeSubDomains; preload
cf-cache-status: DYNAMIC
set-cookie: __cf_bm=...
x-content-type-options: nosniff
server: cloudflare
cf-ray: [...]
alt-svc: h3=":443"; ma=86400
Body:
{
"error": {
"message": "Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.",
"type": "invalid_request_error"
,
"param": "max_tokens",
"code": "unsupported_parameter"
}
}
Error: Error code: 400 - {'error': {'message': "Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.", 'type': 'invalid_request_error', 'param': 'max_tokens', 'code': 'unsupported_parameter'}}
$ llm --version
llm, version 0.20
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 22.04.5 LTS
Release: 22.04
Codename: jammy
$ uname -a
Linux [...] 6.8.0-52-generic #53~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Wed Jan 15 19:18:46 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
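For reference, the rejection comes from the API itself rather than from llm; the same 400 can be reproduced with a direct call (a sketch, same assumptions as the snippet above):

```python
# Sketch only: sending the deprecated max_tokens parameter to an o1 model
# reproduces the unsupported_parameter error that llm surfaces above.
import openai
from openai import OpenAI

client = OpenAI()
try:
    client.chat.completions.create(
        model="o1-preview",
        messages=[{"role": "user", "content": "The response to this is a creative novel"}],
        max_tokens=32000,
    )
except openai.BadRequestError as exc:
    print(exc)  # Unsupported parameter: 'max_tokens' ... Use 'max_completion_tokens' instead.
```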
llm Documentation

This documentation can also be updated along with fixing the code, as it lists max_tokens instead of max_completion_tokens for the o1 models: https://github.com/simonw/llm/blob/main/docs/contributing.md

Urgh. The way I see it there are three options here:

1. Remove the max_tokens option in favor of max_completion_tokens. This is consistent with OpenAI's API but inconsistent with other models; users of LLM would have to think about which option to use even though they do the same thing.
2. Stick with -o max_tokens 100 as the LLM option and send max_completion_tokens to the API. This is better for trying the same prompt against multiple models and saves users of LLM from having to think about OpenAI's non-standard naming, but it is inconsistent with the OpenAI API.
3. Support both. A bit ugly, but it papers over both problems (sketched below).

I think I like option 3: it makes OpenAI's weird issue visible in the LLM docs but feels the most convenient for users.

This is why designing abstraction layers across multiple models is hard!
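To make option 3 concrete, here is a rough sketch (not the actual llm code; the function name and signature are made up for illustration) of how both options could be accepted and mapped onto whichever parameter the target model expects:

```python
# Hypothetical sketch, not the llm codebase: accept both options and translate
# max_tokens into max_completion_tokens for models that reject the old name.
def build_token_limit_kwargs(model_id, max_tokens=None, max_completion_tokens=None):
    kwargs = {}
    # o1 series models only accept the newer parameter.
    requires_new_param = model_id.startswith("o1")
    if max_completion_tokens is not None:
        kwargs["max_completion_tokens"] = max_completion_tokens
    elif max_tokens is not None:
        if requires_new_param:
            kwargs["max_completion_tokens"] = max_tokens
        else:
            kwargs["max_tokens"] = max_tokens
    return kwargs


# Example: either spelling ends up as max_completion_tokens for o1-preview.
assert build_token_limit_kwargs("o1-preview", max_tokens=32000) == {
    "max_completion_tokens": 32000
}
assert build_token_limit_kwargs("gpt-4o-mini", max_tokens=100) == {"max_tokens": 100}
```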