[Bugfix] Fix inappropriate content of model_name tag in Prometheus metrics #3937
Conversation
… match `served_model_name`.
Compare: 24c7b63 → 9ad067d
vllm/engine/arg_utils.py (Outdated)
```python
            attr: getattr(args, attr)
            for attr in attrs if hasattr(args, attr)
        })
        if hasattr(args, "served_model_name"):
```
Can this be simplified to:
```python
if engine_args.served_model_name is None:
    engine_args.served_model_name = args.model
```
The above dict comprehension would have already set `engine_args.served_model_name` if `served_model_name` was an attribute on `args`.
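For context, a self-contained sketch of the suggested shape of `from_cli_args`; the fields shown are an illustrative subset, not the real `EngineArgs`:

```python
import argparse
import dataclasses
from typing import Optional


@dataclasses.dataclass
class EngineArgs:
    # Illustrative subset of the real fields.
    model: str = "facebook/opt-125m"
    served_model_name: Optional[str] = None

    @classmethod
    def from_cli_args(cls, args: argparse.Namespace) -> "EngineArgs":
        # Copy every dataclass field that also exists on the parsed args.
        attrs = [field.name for field in dataclasses.fields(cls)]
        engine_args = cls(**{
            attr: getattr(args, attr)
            for attr in attrs if hasattr(args, attr)
        })
        # Suggested fallback: default to the model path when no name was given.
        if engine_args.served_model_name is None:
            engine_args.served_model_name = args.model
        return engine_args
```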
Thanks for your suggestion!
I adopted this approach as a way to adapt to `vllm.entrypoint.apiserver`. For now, `served_model_name` only appears in `vllm.entrypoint.openai.apiserver` and currently has no description; when using `vllm.entrypoint.apiserver`, `served_model_name` is not an attribute of `args`.
Perhaps we could consider completing the `served_model_name` parameter in `vllm.entrypoint.apiserver` as another approach? If so, the implementation here could be simplified as suggested, and that change would not affect anyone using `AsyncEngineArgs`/`EngineArgs` directly.
I believe `served_model_name` is useful in all scenarios. If it cannot be set, as in `vllm.entrypoint.apiserver`, a local path can end up being used as the model's label, which does not seem right.
@DearPlanet `vllm.entrypoint.apiserver` is meant more as a toy example and is not being actively improved.
That said, I agree that if we are using this name for internal purposes it makes more sense to move it into `EngineArgs`; would you be OK making that change? You could also augment its existing description with a mention that the first name will also be used in the metrics output. This should also address some of the latest CI test failures.
And then it may be best for the check to be `if not engine_args.served_model_name:` just in case it's an empty list (in theory it shouldn't be).
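A rough sketch of what moving `served_model_name` into `EngineArgs` with that check might look like; the field type and the helper name `resolve_served_model_name` are assumptions for illustration, not the exact change that landed:

```python
import dataclasses
from typing import List, Optional, Union


@dataclasses.dataclass
class EngineArgs:
    # Illustrative subset of fields.
    model: str = "facebook/opt-125m"
    # One or more names for the served model; the description should note
    # that the first name is also used as the model_name tag in the metrics.
    served_model_name: Optional[Union[str, List[str]]] = None


def resolve_served_model_name(engine_args: EngineArgs) -> None:
    # Hypothetical helper: `if not ...` also covers an (unexpected) empty
    # list, not just None, before falling back to the model path.
    if not engine_args.served_model_name:
        engine_args.served_model_name = engine_args.model
```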
Compare: 6b56bde → a0e7407
@DearPlanet thank you very much for the contribution!
Would you also be willing to add a unit test for this? Hopefully it should be straightforward to extend the existing `test_metrics.py`.
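A rough sketch of the kind of test that could be added; the fixture style here (constructing `LLM` directly and reading the Prometheus registry) and the model/served names are assumptions, not the exact contents of `test_metrics.py`:

```python
from prometheus_client import REGISTRY

from vllm import LLM, SamplingParams

# Placeholder model and served name for illustration.
MODEL = "facebook/opt-125m"
SERVED_MODEL_NAME = "my-served-model"


def test_metrics_use_served_model_name():
    llm = LLM(model=MODEL,
              served_model_name=SERVED_MODEL_NAME,
              disable_log_stats=False)
    llm.generate("Hello, my name is", SamplingParams(max_tokens=8))

    # The model_name label should be the served name, not the local path.
    value = REGISTRY.get_sample_value("vllm:prompt_tokens_total",
                                      {"model_name": SERVED_MODEL_NAME})
    assert value is not None and value > 0
```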
vllm/engine/arg_utils.py (Outdated)
```python
            for attr in attrs if hasattr(args, attr)
        })
        if hasattr(args, "served_model_name"):
            engine_args.served_model_name = args.served_model_name
```
It is a multi-valued arg and will be a list.
Suggested change:
```diff
- engine_args.served_model_name = args.served_model_name
+ engine_args.served_model_name = args.served_model_name[0]
```
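For context, a small self-contained illustration of why the attribute arrives as a list, assuming the option is declared with `nargs="+"` (the usual way to define a multi-valued flag); the names passed on the command line are placeholders:

```python
import argparse

parser = argparse.ArgumentParser()
# A multi-valued flag: argparse collects one or more values into a list.
parser.add_argument("--served-model-name", nargs="+", type=str, default=None)

args = parser.parse_args(
    ["--served-model-name", "qwen1.5-14b-chat", "qwen-chat"])
print(args.served_model_name)     # ['qwen1.5-14b-chat', 'qwen-chat']
print(args.served_model_name[0])  # 'qwen1.5-14b-chat' -> used for metrics
```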
This reverts commit c6adebc.
…d descriptions for all param entries
Thanks for the suggestions @njhill!
Thanks @DearPlanet, just have a couple of minor suggestions.
Co-authored-by: Nick Hill <[email protected]>
Co-authored-by: Nick Hill <[email protected]>
Compare: db5e462 → 7ea83a8
Thank you again for the suggestions @njhill! 😊
Description
When deploying a local model, such as one located at a path like `/mnt/Qwen1.5-14B-chat/`, the Prometheus metrics output exposes this model path information. The metrics output looks like this:
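Roughly along these lines, with placeholder values; `vllm:num_requests_running` and `vllm:prompt_tokens_total` are assumed here as examples of metrics that carry the `model_name` label:

```
vllm:num_requests_running{model_name="/mnt/Qwen1.5-14B-chat/"} 0.0
vllm:prompt_tokens_total{model_name="/mnt/Qwen1.5-14B-chat/"} 128.0
```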
This raises two issues:
1. The metrics expose the local model path rather than a proper model name.
2. `model_name` in the metrics does not correspond to the model name provided in `ModelCard` by `openai.api_server`.

This PR solves the problem by passing the `served_model_name` parameter into `ModelConfig` and parsing it in the `LLMEngine`. For scenarios not using `openai.apiserver`, `served_model_name` can also be passed to `AsyncEngineArgs`/`EngineArgs`. When the `served_model_name` parameter is not specified, the Prometheus metrics output behaves as before (although I still do not recommend this).
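A minimal sketch of the usage described above; the served name, prompt, and sampling settings are placeholders:

```python
from vllm import EngineArgs, LLMEngine, SamplingParams

engine_args = EngineArgs(
    model="/mnt/Qwen1.5-14B-chat/",        # local path used to load weights
    served_model_name="qwen1.5-14b-chat",  # name reported in metrics
    disable_log_stats=False,
)
engine = LLMEngine.from_engine_args(engine_args)
engine.add_request("request-0", "Hello, my name is",
                   SamplingParams(max_tokens=8))
while engine.has_unfinished_requests():
    engine.step()
# Prometheus metrics are now labeled model_name="qwen1.5-14b-chat"
# instead of the local path.
```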