Fix vllm:prompt_tokens_total metric calculation #2869
Conversation
LGTM
tests/conftest.py (outdated)
@@ -14,11 +14,9 @@
def _read_prompts(filename: str) -> str:
Looks like this function has the wrong signature. The output seems to be List[str].
Good catch! Fixed.
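For reference, a minimal sketch of what the corrected helper might look like, assuming one prompt per line in the file (the actual fixture in tests/conftest.py may differ):

from typing import List

def _read_prompts(filename: str) -> List[str]:
    # Return every prompt in the file, not just the first one.
    with open(filename, "r") as f:
        return [line.strip() for line in f if line.strip()]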
@simon-mo This is a small fix for stats. Safe to merge.
LGTM! Thank you for your contribution!
I noticed that the 2nd prompt of […]
Possibly related to: #975
I have identified an issue with the vllm:prompt_tokens_total counter metric when there are multiple prompts in a batch with different token lengths. The root cause is that the metric counts the token length of the longest prompt in the batch multiplied by the number of prompts in the batch, as if the shorter prompts were padded to match the longest.
Code ref: vllm/vllm/core/scheduler.py, lines 262 to 263 in 7e45107
This PR resolves this issue by accurately counting the tokens of all the prompts in the batch. A simple unit test has been added to validate the correctness of the counter.
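For illustration, here is a simplified sketch of the counting change; the function names are hypothetical and the real logic lives in vLLM's scheduler and metrics code:

from typing import List

def prompt_tokens_padded(seq_lens: List[int]) -> int:
    # Buggy behavior: every prompt is counted as if it were padded to the
    # length of the longest prompt in the batch.
    return len(seq_lens) * max(seq_lens) if seq_lens else 0

def prompt_tokens_actual(seq_lens: List[int]) -> int:
    # Fixed behavior: count the real token length of each prompt.
    return sum(seq_lens)

# A batch with prompt lengths 3, 5, and 10:
lens = [3, 5, 10]
assert prompt_tokens_padded(lens) == 30  # over-counts by 12
assert prompt_tokens_actual(lens) == 18  # correct total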
Additionally, I have fixed _read_prompts() to read all prompts from a file, rather than just the first prompt.
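A quick illustration of the corrected behavior, reusing the _read_prompts sketch above (the file path and contents here are hypothetical):

# Suppose tests/prompts/example.txt contains:
#   Hello, my name is
#   The capital of France is
prompts = _read_prompts("tests/prompts/example.txt")
assert len(prompts) == 2  # both prompts are returned, not just the first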