Add initial support for intel Gaudi accelerators #2121

Merged
merged 3 commits into from
Nov 23, 2024

ankurneog
Contributor

Motivation

Intel Gaudi (device name: hpu) has PyTorch support as an out-of-tree device. In addition, Triton support is also available. With this initial PR, we introduce Intel Gaudi (device: hpu) to SGLang.
More details on Intel Gaudi SW installation can be found here:
https://docs.habana.ai/en/latest/PyTorch/Getting_Started_with_PyTorch_and_Gaudi/Getting_Started_with_PyTorch.html#getting-started-pyt-model
Triton support:
https://docs.habana.ai/en/latest/PyTorch/Inference_on_PyTorch/Triton_Inference.html

@ankurneog
Contributor Author

@liangan1: can you have a look at the initial changes? Thanks.

@@ -186,6 +186,9 @@ def init_torch_distributed(self):
         elif self.device == "xpu":
             torch.xpu.set_device(self.gpu_id)
             backend = "gloo"
+        elif self.device == "hpu":
+            torch.get_device_module(self.device).set_device(self.gpu_id)
+            backend = "hccl"
Contributor

Refactoring the other device paths to use "torch.get_device_module(self.device).set_device(self.gpu_id)" as well may be a better choice, since it would make the code simpler.

Contributor Author

Done. This will be further simplified once pytorch/pytorch#140536 is available; that way we won't have any device name references in the file.
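
For readers following this thread, the refactor being discussed could look roughly like the sketch below. It is not the code that was merged. The function name `select_device_and_backend`, the `DIST_BACKENDS` table, and the injectable `get_device_module` parameter are hypothetical and exist only for illustration. The "xpu" and "hpu" backend names come from the diff above; the "cuda" and "cpu" entries are assumptions.

```python
from typing import Callable, Optional

# Distributed backend per device type.
# "xpu" -> "gloo" and "hpu" -> "hccl" are taken from the diff in this PR;
# the "cuda" and "cpu" entries are assumptions added for illustration.
DIST_BACKENDS = {
    "cuda": "nccl",
    "xpu": "gloo",
    "hpu": "hccl",
    "cpu": "gloo",
}


def select_device_and_backend(
    device: str,
    device_id: int,
    get_device_module: Optional[Callable] = None,
) -> str:
    """Set the current device generically and return its backend name.

    `get_device_module` is a hypothetical injection point so the logic can be
    exercised without an accelerator; by default it is torch.get_device_module.
    """
    if device not in DIST_BACKENDS:
        raise ValueError(f"Unsupported device type: {device!r}")

    if get_device_module is None:
        # Imported lazily so the lookup table is usable without PyTorch.
        import torch

        get_device_module = torch.get_device_module

    # One call replaces the per-device torch.cuda / torch.xpu / torch.hpu branches.
    get_device_module(device).set_device(device_id)
    return DIST_BACKENDS[device]
```

With this shape, the device name appears only as a lookup key, which is the direction the reply above describes: supporting another accelerator would mean adding one table entry rather than another `elif` branch.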

@ankurneog ankurneog force-pushed the intel_gaudi_support_1 branch from ba8495f to 91c7594 Compare November 22, 2024 09:05
@ankurneog ankurneog changed the title Add intial support for intel gaudi accelerators Add initial support for intel Gaudi accelerators Nov 22, 2024
@merrymercy merrymercy merged commit 865233e into sgl-project:main Nov 23, 2024
13 checks passed
@merrymercy merrymercy mentioned this pull request Nov 24, 2024
37 tasks