From b7e98ea0d3f9fb56c9cdcdbe5b5486022142f344 Mon Sep 17 00:00:00 2001
From: Hongxia Yang
Date: Tue, 21 Jan 2025 21:11:38 +0000
Subject: [PATCH 1/3] [Documentation][AMD] add information about prebuilt vLLM on ROCm docker

Signed-off-by: Hongxia Yang
---
 docs/source/getting_started/installation/gpu/rocm.inc.md | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/docs/source/getting_started/installation/gpu/rocm.inc.md b/docs/source/getting_started/installation/gpu/rocm.inc.md
index 8ef1bc95fd522..353db940a8c0e 100644
--- a/docs/source/getting_started/installation/gpu/rocm.inc.md
+++ b/docs/source/getting_started/installation/gpu/rocm.inc.md
@@ -13,6 +13,13 @@ vLLM supports AMD GPUs with ROCm 6.2.
 
 Currently, there are no pre-built ROCm wheels.
 
+However, the ROCm [ROCm Hub for vLLM](https://hub.docker.com/r/rocm/vllm/tags) offers a prebuilt, optimized environment
+designed for validating large language model (LLM) inference performance on the AMD Instinct™ MI300X accelerator.
+
+```{tip}
+Please check [LLM inference performance validation on AMD Instinct MI300X] (https://rocm.docs.amd.com/en/latest/how-to/performance-validation/mi300x/vllm-benchmark.html) for instructions on how to use this prebuilt docker image to validate inference performance on MI300x accelerator.
+```
+
 ### Build wheel from source
 
 0. Install prerequisites (skip if you are already in an environment/docker with the following installed):

From 991292792c4fd28482fe3b096fe1e627e506fac5 Mon Sep 17 00:00:00 2001
From: Hongxia Yang
Date: Tue, 21 Jan 2025 21:31:12 +0000
Subject: [PATCH 2/3] update wording

Signed-off-by: Hongxia Yang
---
 docs/source/getting_started/installation/gpu/rocm.inc.md | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/docs/source/getting_started/installation/gpu/rocm.inc.md b/docs/source/getting_started/installation/gpu/rocm.inc.md
index 353db940a8c0e..09b79a436879c 100644
--- a/docs/source/getting_started/installation/gpu/rocm.inc.md
+++ b/docs/source/getting_started/installation/gpu/rocm.inc.md
@@ -13,11 +13,12 @@ vLLM supports AMD GPUs with ROCm 6.2.
 
 Currently, there are no pre-built ROCm wheels.
 
-However, the ROCm [ROCm Hub for vLLM](https://hub.docker.com/r/rocm/vllm/tags) offers a prebuilt, optimized environment
-designed for validating large language model (LLM) inference performance on the AMD Instinct™ MI300X accelerator.
+However, the [ROCm Hub for vLLM](https://hub.docker.com/r/rocm/vllm/tags) offers a prebuilt, optimized
+docker image designed for validating inference performance on the AMD Instinct™ MI300X accelerator.
 
 ```{tip}
-Please check [LLM inference performance validation on AMD Instinct MI300X] (https://rocm.docs.amd.com/en/latest/how-to/performance-validation/mi300x/vllm-benchmark.html) for instructions on how to use this prebuilt docker image to validate inference performance on MI300x accelerator.
+Please check [LLM inference performance validation on AMD Instinct MI300X](https://rocm.docs.amd.com/en/latest/how-to/performance-validation/mi300x/vllm-benchmark.html)
+for instructions on how to use this prebuilt docker image.
 ```
 
 ### Build wheel from source

From 6ef058a200e22a8f789fa9707ef8e2826302486e Mon Sep 17 00:00:00 2001
From: Hongxia Yang
Date: Tue, 21 Jan 2025 21:53:26 +0000
Subject: [PATCH 3/3] minor update on wording

Signed-off-by: Hongxia Yang
---
 docs/source/getting_started/installation/gpu/rocm.inc.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/getting_started/installation/gpu/rocm.inc.md b/docs/source/getting_started/installation/gpu/rocm.inc.md
index 09b79a436879c..69238f6e36fb2 100644
--- a/docs/source/getting_started/installation/gpu/rocm.inc.md
+++ b/docs/source/getting_started/installation/gpu/rocm.inc.md
@@ -13,7 +13,7 @@ vLLM supports AMD GPUs with ROCm 6.2.
 
 Currently, there are no pre-built ROCm wheels.
 
-However, the [ROCm Hub for vLLM](https://hub.docker.com/r/rocm/vllm/tags) offers a prebuilt, optimized
+However, the [AMD Infinity hub for vLLM](https://hub.docker.com/r/rocm/vllm/tags) offers a prebuilt, optimized
 docker image designed for validating inference performance on the AMD Instinct™ MI300X accelerator.
 
 ```{tip}
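For readers of this series, fetching and starting the `rocm/vllm` image the patches document looks roughly like the sketch below. The tag choice and the device/IPC flags are assumptions drawn from common ROCm container usage, not from the patches themselves; the linked Docker Hub and performance-validation pages are the authoritative source for current tags and run options.

```shell
# Pull the prebuilt ROCm vLLM image (pick a concrete tag from
# https://hub.docker.com/r/rocm/vllm/tags; "latest" is used here only as a placeholder)
docker pull rocm/vllm:latest

# Run interactively with GPU access. /dev/kfd and /dev/dri expose AMD GPUs to the
# container; --group-add video grants device permissions; host IPC and a large
# shared-memory segment are commonly needed for multi-process inference workloads.
docker run -it --rm \
  --device=/dev/kfd --device=/dev/dri \
  --group-add video \
  --ipc=host --shm-size 16G \
  rocm/vllm:latest
```

This requires a host with a working ROCm driver stack and a Docker daemon; on machines without AMD GPUs the `docker run` step will fail at device mapping.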