[Hardware] [HPU]add `mark_step` for hpu #10239

jikunshang · 2024-11-12T03:04:52Z

This PR adds mark_step after each model's decoder layer, which can improve performance without modifying the original model files.
mark_step doc

cc @kzawora-intel

github-actions · 2024-11-12T03:05:08Z

👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small, essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can do one of these:

- Add the `ready` label to the PR
- Enable auto-merge

🚀


madamczykhabana · 2024-11-12T11:43:31Z

Hi @jikunshang . Could the same functionality be implemented using stock PT functionalities like forward hooks (https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.register_forward_hook)? I guess we only need to call mark_step() and there's no other complex logic involved.


jikunshang · 2024-11-13T00:15:33Z

> Hi @jikunshang . Could the same functionality be implemented using stock PT functionalities like forward hooks (https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.register_forward_hook)? I guess we only need to call mark_step() and there's no other complex logic involved.

Nice catch, I have addressed your comments. Please take another look, thanks!
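For readers following along, the forward-hook approach suggested above can be sketched roughly as follows. This is an illustrative sketch, not the code merged in this PR: the helper name `register_mark_step_hooks` is hypothetical, and `mark_step` is passed in as a plain callable so the snippet does not require Gaudi hardware. The helper relies only on `register_forward_hook`, so it should accept any `torch.nn.Module` layers.

```python
from typing import Any, Callable, Iterable, List


def register_mark_step_hooks(
    layers: Iterable[Any],
    mark_step: Callable[[], None],
) -> List[Any]:
    """Attach a forward hook that calls ``mark_step`` after each layer.

    ``layers`` is expected to contain torch.nn.Module instances (for example
    a model's decoder layers); only ``register_forward_hook`` is used.
    The name and signature of this helper are hypothetical.
    """

    def hook(module: Any, args: Any, output: Any) -> None:
        # Runs after the layer's forward(); returning None leaves the
        # layer output unchanged.
        mark_step()

    # Keep the handles returned by register_forward_hook so the hooks
    # can be removed later if needed.
    return [layer.register_forward_hook(hook) for layer in layers]
```

On HPU, one would presumably pass `habana_frameworks.torch.core.mark_step` as the `mark_step` argument, together with the model's decoder layers. How those layers are located in a given model is outside the scope of this sketch.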

WoosukKwon · 2024-11-13T17:34:27Z

@madamczykhabana Please take a look. I will merge the PR once you approve it.

madamczykhabana

LGTM

jikunshang · 2024-11-15T01:20:32Z

@WoosukKwon Can we merge this now?

DarkLight1337 · 2024-11-17T08:30:22Z

Sorry for missing this!


We are seeing a 10% performance regression in Llama-based models due to vllm-project#10239. The mark_step() call needs to be configured differently for each model to achieve the best performance: for some models, calling mark_step() after every decoder layer is optimal, while for others it is better to call it only after every n-th layer. We are adding a counter so that the hook is registered only for every n-th layer, which can be configured with VLLM_CONFIG_HIDDEN_LAYERS.
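The every-n-th-layer idea described above might look roughly like the sketch below. The source confirms only the environment variable name `VLLM_CONFIG_HIDDEN_LAYERS`. The default of 1, the 1-based counting, and the helper names `hidden_layer_interval` and `register_every_nth` are assumptions made for illustration, and `mark_step` is again injected as a plain callable.

```python
import os
from typing import Any, Callable, List, Sequence


def hidden_layer_interval(default: int = 1) -> int:
    """Read the hook interval from VLLM_CONFIG_HIDDEN_LAYERS.

    The default of 1 (hook on every layer) is an assumption.
    """
    value = int(os.environ.get("VLLM_CONFIG_HIDDEN_LAYERS", default))
    if value < 1:
        raise ValueError("VLLM_CONFIG_HIDDEN_LAYERS must be >= 1")
    return value


def register_every_nth(
    layers: Sequence[Any],
    mark_step: Callable[[], None],
    n: int,
) -> List[Any]:
    """Register a mark_step forward hook on every n-th layer only."""

    def hook(module: Any, args: Any, output: Any) -> None:
        # Returning None keeps the layer output unchanged.
        mark_step()

    handles = []
    # Count layers from 1 so that n == 1 hooks every layer and
    # n == 2 hooks layers 2, 4, 6, ...
    for index, layer in enumerate(layers, start=1):
        if index % n == 0:
            handles.append(layer.register_forward_hook(hook))
    return handles
```

With this counting scheme, an interval of 1 matches the per-layer behavior, while larger values reduce the number of mark_step() calls per forward pass.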


add mark_step for hpu

4cd3598

Signed-off-by: Kunshang Ji <[email protected]>

jikunshang force-pushed the mk_step branch from 995a1a5 to 4cd3598 Compare November 12, 2024 03:23

DarkLight1337 requested a review from WoosukKwon November 12, 2024 09:16

address comments

7e1e8a0

Signed-off-by: Kunshang Ji <[email protected]>

WoosukKwon approved these changes Nov 13, 2024


WoosukKwon added the Gaudi label Nov 13, 2024

madamczykhabana approved these changes Nov 14, 2024


DarkLight1337 enabled auto-merge (squash) November 17, 2024 08:30

github-actions bot added the ready label (ONLY add when PR is ready to merge/full CI is needed) Nov 17, 2024

DarkLight1337 disabled auto-merge November 17, 2024 08:30

DarkLight1337 enabled auto-merge (squash) November 17, 2024 08:30

WoosukKwon disabled auto-merge November 17, 2024 08:44

WoosukKwon merged commit 76aab90 into vllm-project:main Nov 17, 2024
36 of 50 checks passed

lk-chen pushed a commit to lk-chen/vllm that referenced this pull request Nov 18, 2024

[Hardware] [HPU]add mark_step for hpu (vllm-project#10239)

305708b

Signed-off-by: Kunshang Ji <[email protected]> Signed-off-by: Linkun Chen <[email protected]>

jiminha mentioned this pull request Nov 20, 2024

[HPU] Add mark_step configurable for the decoder layer. HabanaAI/vllm-fork#525

Merged

coolkp pushed a commit to coolkp/vllm that referenced this pull request Nov 20, 2024

[Hardware] [HPU]add mark_step for hpu (vllm-project#10239)

860bd51

Signed-off-by: Kunshang Ji <[email protected]>

KuntaiDu pushed a commit to KuntaiDu/vllm that referenced this pull request Nov 20, 2024

[Hardware] [HPU]add mark_step for hpu (vllm-project#10239)

6d9f042

Signed-off-by: Kunshang Ji <[email protected]>

mfournioux pushed a commit to mfournioux/vllm that referenced this pull request Nov 20, 2024

[Hardware] [HPU]add mark_step for hpu (vllm-project#10239)

ac3a0da

Signed-off-by: Kunshang Ji <[email protected]> Signed-off-by: Maxime Fournioux <[email protected]>

rickyyx pushed a commit to rickyyx/vllm that referenced this pull request Nov 20, 2024

[Hardware] [HPU]add mark_step for hpu (vllm-project#10239)

c44141f

Signed-off-by: Kunshang Ji <[email protected]> Signed-off-by: rickyx <[email protected]>

tlrmchlsmth pushed a commit to neuralmagic/vllm that referenced this pull request Nov 23, 2024

[Hardware] [HPU]add mark_step for hpu (vllm-project#10239)

d2a04af

Signed-off-by: Kunshang Ji <[email protected]> Signed-off-by: Tyler Michael Smith <[email protected]>

jiminha mentioned this pull request Nov 26, 2024

[HPU] Add mark_step configurable for the decoder layer HabanaAI/vllm-fork#548

Closed

prashantgupta24 pushed a commit to opendatahub-io/vllm that referenced this pull request Dec 3, 2024

[Hardware] [HPU]add mark_step for hpu (vllm-project#10239)

f6dc8be

Signed-off-by: Kunshang Ji <[email protected]>

sleepwalker2017 pushed a commit to sleepwalker2017/vllm that referenced this pull request Dec 13, 2024

[Hardware] [HPU]add mark_step for hpu (vllm-project#10239)

5884cba

Signed-off-by: Kunshang Ji <[email protected]>
