Multi step scheduling #441

Merged: 37 commits, Oct 29, 2024
0ac1c86
WIP
tzielinski-habana Oct 4, 2024
0fbb074
WIP - decoding in the last step instead of the first
tzielinski-habana Oct 10, 2024
4e2977a
Using _prepare_decode
tzielinski-habana Oct 10, 2024
4c6ed38
Almost sensible results, but the positions are updated incorrectly
tzielinski-habana Oct 11, 2024
ae8d921
Getting correct token_ids at the end, but the text output is nonsense.…
tzielinski-habana Oct 11, 2024
e83a264
Batch 2 works
tzielinski-habana Oct 14, 2024
132a260
Text output almost OK, but too short
tzielinski-habana Oct 14, 2024
36b0135
Everything seems to work
tzielinski-habana Oct 15, 2024
5e01447
cleanup
tzielinski-habana Oct 16, 2024
cd65486
more cleanup
tzielinski-habana Oct 17, 2024
944f3c9
Minor changes
tzielinski-habana Oct 18, 2024
4aa4157
fix for positions update
tzielinski-habana Oct 18, 2024
12a03dc
Acc > 100% in the MLPerf test
tzielinski-habana Oct 23, 2024
88ddfa7
Removed sync point
tzielinski-habana Oct 24, 2024
89a53bc
Remove CPU sync before Sampler (#414)
kdamaszk Oct 22, 2024
247cf4c
WIP
tzielinski-habana Oct 4, 2024
14806af
WIP - decoding in the last step instead of the first
tzielinski-habana Oct 10, 2024
1dac168
Using _prepare_decode
tzielinski-habana Oct 10, 2024
3cce0c6
Almost sensible results, but the positions are updated incorrectly
tzielinski-habana Oct 11, 2024
1eab92f
Getting correct token_ids at the end, but the text output is nonsense.…
tzielinski-habana Oct 11, 2024
9113e1c
Batch 2 works
tzielinski-habana Oct 14, 2024
0619ae4
Text output almost OK, but too short
tzielinski-habana Oct 14, 2024
0beec98
Everything seems to work
tzielinski-habana Oct 15, 2024
86f1b9c
cleanup
tzielinski-habana Oct 16, 2024
7794983
more cleanup
tzielinski-habana Oct 17, 2024
3311156
Minor changes
tzielinski-habana Oct 18, 2024
ff0d81c
fix for positions update
tzielinski-habana Oct 18, 2024
26171bd
Acc > 100% in the MLPerf test
tzielinski-habana Oct 23, 2024
852788d
Removed sync point
tzielinski-habana Oct 24, 2024
e0d5ce9
add block_groups to execute_model
jmaksymczuk Oct 25, 2024
0f0934c
Merge branch 'multi_step_2' into private/jmaksymczuk/multi_step_2
jmaksymczuk Oct 25, 2024
d055ce8
block_groups (#434)
tzielinski-habana Oct 25, 2024
9fe2662
cleanup
tzielinski-habana Oct 25, 2024
d742bb0
formatter improvements
tzielinski-habana Oct 25, 2024
a7644bc
cleanup, modification + use of _prepare_decode
jmaksymczuk Oct 28, 2024
6a7b045
cleanup
tzielinski-habana Oct 28, 2024
3d815a5
Merge branch 'habana_main' into multi_step_2
tzielinski-habana Oct 29, 2024
11 changes: 9 additions & 2 deletions vllm/executor/hpu_executor.py
@@ -54,9 +54,16 @@ def _create_worker(self,
                        local_rank: int = 0,
                        rank: int = 0,
                        distributed_init_method: Optional[str] = None):
+        if self.scheduler_config.is_multi_step:
+            module_name = "vllm.worker.multi_step_hpu_worker"
+            class_name = "MultiStepHPUWorker"
+        else:
+            module_name = "vllm.worker.hpu_worker"
+            class_name = "HPUWorker"
+
         wrapper = WorkerWrapperBase(
-            worker_module_name="vllm.worker.hpu_worker",
-            worker_class_name="HPUWorker",
+            worker_module_name=module_name,
+            worker_class_name=class_name,
         )
         wrapper.init_worker(**self._get_worker_kwargs(local_rank, rank,
                                                       distributed_init_method))
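
The selection logic this hunk adds to `_create_worker` can be sketched on its own. This is a minimal illustration, not the code from the PR: `SchedulerConfigStub` and `select_hpu_worker` are hypothetical names introduced here, and only the module and class strings come from the diff.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class SchedulerConfigStub:
    """Hypothetical stand-in for the real scheduler config object."""
    is_multi_step: bool = False


def select_hpu_worker(scheduler_config: SchedulerConfigStub) -> Tuple[str, str]:
    """Return (module_name, class_name) following the branch in the diff."""
    if scheduler_config.is_multi_step:
        # Multi-step scheduling gets the dedicated worker class.
        return "vllm.worker.multi_step_hpu_worker", "MultiStepHPUWorker"
    # Otherwise keep the worker that was previously hard-coded.
    return "vllm.worker.hpu_worker", "HPUWorker"


if __name__ == "__main__":
    print(select_hpu_worker(SchedulerConfigStub(is_multi_step=True)))
    print(select_hpu_worker(SchedulerConfigStub(is_multi_step=False)))
```

Passing names rather than imported classes means the multi-step module is only referenced when the flag is set, which appears to be why the diff swaps the hard-coded strings for variables instead of importing either worker directly.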
4 changes: 2 additions & 2 deletions vllm/executor/ray_hpu_executor.py
@@ -87,8 +87,8 @@ def _get_worker_module_and_class(
                                            Type[WorkerBase]]]]:  # noqa: F821
         worker_class_fn = None
         if self.scheduler_config.is_multi_step:
-            raise NotImplementedError(
-                "Multi-step execution is not implemented for HPU")
+            worker_module_name = "vllm.worker.multi_step_hpu_worker"
+            worker_class_name = "MultiStepHPUWorker"
         elif self.speculative_config:
             raise NotImplementedError(
                 "Speculative decoding is not implemented for HPU")
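
The behavior change in the Ray executor is that the multi-step branch now resolves to a worker instead of raising, while the speculative-decoding branch still raises. A rough sketch follows, under stated assumptions: `select_ray_hpu_worker` is a hypothetical helper, and the final fallback to `HPUWorker` is not visible in this hunk, so it is assumed by analogy with `hpu_executor.py`.

```python
from typing import Optional, Tuple


def select_ray_hpu_worker(
        is_multi_step: bool,
        speculative_config: Optional[object] = None) -> Tuple[str, str]:
    """Mirror the branch order in the diff: multi-step is checked first."""
    if is_multi_step:
        # Before this PR, this branch raised NotImplementedError.
        return "vllm.worker.multi_step_hpu_worker", "MultiStepHPUWorker"
    elif speculative_config:
        # Unchanged by the PR: speculative decoding is still unsupported.
        raise NotImplementedError(
            "Speculative decoding is not implemented for HPU")
    # Assumption: the default branch lies outside the hunk shown above.
    return "vllm.worker.hpu_worker", "HPUWorker"
```

Because the multi-step check comes first, a configuration with both multi-step and a speculative config set would select `MultiStepHPUWorker` in this sketch without ever reaching the speculative branch.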