Semantic conventions for LLM model server metrics #1102
Labels: area:gen-ai, enhancement, experts needed
Is your change request related to a problem? Please describe.
The GenAI semantic conventions today cover client metrics. It would be good to also add model server metrics, providing a standard set of performance and operational metrics for model servers. We have a proposal to standardize them as part of the Kubernetes Serving WG - https://docs.google.com/document/d/1SpSp1E6moa4HSrJnS4x3NpLuj88sMXr2tbofKlzTZpk/edit?resourcekey=0-ob5dR-AJxLQ5SvPlA4rdsg&tab=t.0#heading=h.qmzyorj64um1 (please request access if you are not able to view it). Standardizing these via the LLM semantic conventions in OpenTelemetry would be ideal.
For context, see the previous discussion on this in the LLM Semantic Conventions WG - https://docs.google.com/document/d/1EKIeDgBGXQPGehUigIRLwAUpRGa7-1kXB736EaYuJ2M/edit#bookmark=id.f4yu21sfndxj.
Describe the solution you'd like
The solution is to include these common model server metrics in the GenAI LLM semantic conventions.
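As a rough illustration of what such a convention could look like, the sketch below models a candidate set of server-side instrument definitions. The metric names (`gen_ai.server.*`), instrument types, and units are illustrative assumptions modeled on typical model-server performance signals (request latency, time to first token, time per output token), not the agreed-upon convention.

```python
# Hypothetical sketch of candidate model server metrics for the GenAI
# semantic conventions. Names, instrument types, and units below are
# illustrative assumptions, not the finalized convention.
from dataclasses import dataclass

@dataclass(frozen=True)
class MetricSpec:
    name: str        # candidate semantic-convention metric name (assumed)
    instrument: str  # OpenTelemetry instrument type
    unit: str        # UCUM unit string, as used by OpenTelemetry
    description: str

# Assumed candidate set of common model server performance signals.
CANDIDATE_SERVER_METRICS = [
    MetricSpec("gen_ai.server.request.duration", "histogram", "s",
               "End-to-end time to serve an inference request"),
    MetricSpec("gen_ai.server.time_to_first_token", "histogram", "s",
               "Latency until the first output token is produced"),
    MetricSpec("gen_ai.server.time_per_output_token", "histogram", "s",
               "Average time per output token after the first"),
]

def follows_convention(spec: MetricSpec) -> bool:
    """Check that a spec uses the assumed gen_ai.server.* namespace
    and a known OpenTelemetry instrument type."""
    valid_instruments = {"histogram", "counter", "updowncounter", "gauge"}
    return (spec.name.startswith("gen_ai.server.")
            and spec.instrument in valid_instruments)
```

A model server instrumented against such a convention would create these instruments once at startup and record to them per request, so dashboards and alerts become portable across serving stacks.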
Describe alternatives you've considered
No response
Additional context
cc @lmolkova @SergeyKanzhelev