Semantic conventions for LLM model server metrics #1102
Labels: area:gen-ai, enhancement, experts needed
Is your change request related to a problem? Please describe.
The GenAI semantic conventions today cover client metrics. It would be good to also add model server metrics, providing a standard set of performance and operational metrics for model servers. We have a proposal to standardize them as part of the Kubernetes Serving WG - https://docs.google.com/document/d/1SpSp1E6moa4HSrJnS4x3NpLuj88sMXr2tbofKlzTZpk/edit?resourcekey=0-ob5dR-AJxLQ5SvPlA4rdsg&tab=t.0#heading=h.qmzyorj64um1 (please request access if you are not able to view it). Standardizing these via the LLM semantic conventions in OpenTelemetry would be ideal.
For context, see the previous discussion on this in the LLM Semantic Conventions WG - https://docs.google.com/document/d/1EKIeDgBGXQPGehUigIRLwAUpRGa7-1kXB736EaYuJ2M/edit#bookmark=id.f4yu21sfndxj.
Describe the solution you'd like
The solution is to include these common model server metrics in the GenAI LLM semantic conventions.
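As a rough illustration of what such a convention could look like, the sketch below models a candidate set of server-side instrument definitions. The metric names (`gen_ai.server.*`), instrument types, and units are illustrative assumptions modeled on typical model-server performance signals (request latency, time to first token, time per output token), not the agreed-upon convention.

```python
# Hypothetical sketch of candidate model server metrics for the GenAI
# semantic conventions. Names, instrument types, and units below are
# illustrative assumptions, not the finalized convention.
from dataclasses import dataclass

@dataclass(frozen=True)
class MetricSpec:
    name: str        # candidate semantic-convention metric name (assumed)
    instrument: str  # OpenTelemetry instrument type
    unit: str        # UCUM unit string, as used by OpenTelemetry
    description: str

# Assumed candidate set of common model server performance signals.
CANDIDATE_SERVER_METRICS = [
    MetricSpec("gen_ai.server.request.duration", "histogram", "s",
               "End-to-end time to serve an inference request"),
    MetricSpec("gen_ai.server.time_to_first_token", "histogram", "s",
               "Latency until the first output token is produced"),
    MetricSpec("gen_ai.server.time_per_output_token", "histogram", "s",
               "Average time per output token after the first"),
]

def follows_convention(spec: MetricSpec) -> bool:
    """Check that a spec uses the assumed gen_ai.server.* namespace
    and a known OpenTelemetry instrument type."""
    valid_instruments = {"histogram", "counter", "updowncounter", "gauge"}
    return (spec.name.startswith("gen_ai.server.")
            and spec.instrument in valid_instruments)
```

A model server instrumented against such a convention would create these instruments once at startup and record to them per request, so dashboards and alerts become portable across serving stacks.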
Describe alternatives you've considered
No response
Additional context
cc @lmolkova @SergeyKanzhelev