
On Ascend, inference results are truncated when using the lmdeploy APIClient interface #2969

Open
3 tasks done
winni0 opened this issue Dec 28, 2024 · 5 comments

winni0 commented Dec 28, 2024

Checklist

  • 1. I have searched related issues but cannot get the expected help.
  • 2. The bug has not been fixed in the latest version.
  • 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.

Describe the bug

The server is started on Ascend with:

lmdeploy serve api_server /LLaMA-Factory-main/model/Qwen2.5-7B-Instruct \
    --backend pytorch \
    --server-port 8000 \
    --device ascend \
    --session-len 8192

When the results are received through the lmdeploy APIClient interface, the inference output is truncated.

[screenshot: truncated inference output]

Reproduction

The API server launch command is:

lmdeploy serve api_server /LLaMA-Factory-main/model/Qwen2.5-7B-Instruct \
    --backend pytorch \
    --server-port 8000 \
    --device ascend \
    --session-len 8192

The lmdeploy APIClient code is shown below.

[screenshot: lmdeploy APIClient code]
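Since the client code was only attached as a screenshot, here is a minimal sketch of typical APIClient usage against such a server (the server address and prompt are placeholders, not the reporter's exact code):

```python
# Minimal sketch, assuming the api_server above is reachable on port 8000.
from lmdeploy.serve.openai.api_client import APIClient

api_client = APIClient('http://0.0.0.0:8000')  # placeholder address
model_name = api_client.available_models[0]    # name of the served model

# completions_v1 streams results back as a generator of dicts.
for output in api_client.completions_v1(model=model_name,
                                        prompt='Please introduce large language models.'):
    print(output)
```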

Environment

TorchVision: 0.18.1
LMDeploy: 0.6.4+191a7dd
transformers: 4.47.1
gradio: Not Found
fastapi: 0.115.6
pydantic: 2.10.4
triton: Not Found

Error traceback

No errors were reported.
jinminxi104 self-assigned this Dec 29, 2024
jinminxi104 (Collaborator) commented

Please add max_tokens to the completions_v1 request. (Also note that you are running in graph mode, which causes a compilation phase on the first run.)
[screenshot: completions_v1 call with max_tokens]
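For illustration, passing max_tokens explicitly might look like the following sketch (api_client and model_name as in the sketch above; the value 2048 matches the "2k" suggestion later in this thread):

```python
# Sketch: request a larger completion budget so the reply is not cut off
# at the server's default max_tokens.
for output in api_client.completions_v1(model=model_name,
                                        prompt='Please introduce large language models.',
                                        max_tokens=2048):
    print(output)
```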

winni0 (Author) commented Jan 3, 2025

Your answer's output is truncated too.

jinminxi104 (Collaborator) commented

> Your answer's output is truncated too.

My example also sets the truncation value. You can set it larger, e.g. to 2k.

MiningIrving commented Jan 5, 2025

> My example also sets the truncation value. You can set it larger, e.g. to 2k.

May I ask, are Qwen2.5 models with 14B or more parameters supported now?

jinminxi104 (Collaborator) commented

> My example also sets the truncation value. You can set it larger, e.g. to 2k.

> May I ask, are Qwen2.5 models with 14B or more parameters supported now?

Yes, they are supported; please set an appropriate tp (tensor parallelism) value.
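For example, a multi-device launch might look like the sketch below (the model path and --tp value are placeholders; set --tp to the number of Ascend devices you want to shard the model across):

```shell
lmdeploy serve api_server /path/to/Qwen2.5-14B-Instruct \
    --backend pytorch \
    --device ascend \
    --server-port 8000 \
    --session-len 8192 \
    --tp 2
```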
