feat: support sm75 with FlashInfer v0.1.6 #1233

Merged 3 commits on Aug 28, 2024


zhyncs
Member

@zhyncs zhyncs commented Aug 27, 2024

Motivation

Support T4 (sm75) GPUs.

Waiting on the FlashInfer v0.1.6 release: https://github.com/flashinfer-ai/flashinfer/releases/tag/v0.1.6

Modifications

Checklist

  • Format your code according to the Contributor Guide.
  • Add unit tests as outlined in the Contributor Guide.
  • Update documentation as needed, including docstrings or example tutorials.

@zhyncs zhyncs added the wip label Aug 27, 2024
@zhyncs zhyncs self-assigned this Aug 27, 2024
@zhyncs zhyncs marked this pull request as draft August 27, 2024 13:48
@zhyncs zhyncs marked this pull request as ready for review August 28, 2024 07:32
@zhyncs zhyncs removed the wip label Aug 28, 2024
@zhyncs
Member Author

zhyncs commented Aug 28, 2024

python3 -m sglang.launch_server --model Qwen/Qwen2-1.5B-Instruct --mem-fraction-static 0.7

@zhyncs zhyncs merged commit 198974c into sgl-project:main Aug 28, 2024
8 checks passed
@zhyncs zhyncs deleted the fi branch August 28, 2024 08:39
@zhyncs
Member Author

zhyncs commented Aug 28, 2024

Hi @horiacristescu, could you try the latest main on T4? Thanks.

# Use the latest main branch
git clone https://github.com/sgl-project/sglang.git
cd sglang

pip install --upgrade pip
pip install -e "python[all]"

# Install FlashInfer CUDA kernels
pip install flashinfer -i https://flashinfer.ai/whl/cu121/torch2.4/

python3 -m sglang.launch_server --model Qwen/Qwen2-1.5B-Instruct --mem-fraction-static 0.7

It works for me on GCP T4.
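
Before running the steps above on another machine, it may help to confirm that the GPU actually reports compute capability 7.5 or higher (a T4 reports 7.5). The following is a minimal sketch, not part of this PR: `is_sm75_or_newer` is a hypothetical helper name, and the `compute_cap` query field is assumed to be available, which requires a reasonably recent NVIDIA driver.

```shell
# Hypothetical helper (not part of sglang or FlashInfer): succeeds when a
# compute capability string such as "7.5" is at least 7.5 (sm75).
is_sm75_or_newer() {
  cap="$1"
  # Reject empty or malformed input such as "" or "N/A".
  case "$cap" in
    [0-9]*.[0-9]*) ;;
    *) return 1 ;;
  esac
  major="${cap%%.*}"
  minor="${cap#*.}"
  [ "$major" -gt 7 ] || { [ "$major" -eq 7 ] && [ "$minor" -ge 5 ]; }
}

# Query the first GPU only when nvidia-smi is installed.
# Assumption: the driver supports the compute_cap query field.
if command -v nvidia-smi >/dev/null 2>&1; then
  cap="$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n 1)"
  if is_sm75_or_newer "$cap"; then
    echo "compute capability $cap: sm75 or newer"
  else
    echo "compute capability '$cap': older than sm75 or could not be read"
  fi
fi
```

If the check reports an older capability or cannot read one, the launch command above may still be worth trying, since this sketch only inspects what the driver reports.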

@zhyncs zhyncs mentioned this pull request Aug 28, 2024
4 tasks
@zhyncs zhyncs mentioned this pull request Sep 4, 2024
5 tasks
3 participants