
Commit

Warning for missing FP8 checkpoint support for vLLM deployment (NVIDIA#10906)

Signed-off-by: Jan Lasek <[email protected]>
janekl authored and artbataev committed Oct 22, 2024
1 parent 9a900ec commit ca66a78
Showing 1 changed file with 5 additions and 0 deletions.
5 changes: 5 additions & 0 deletions nemo/export/vllm_exporter.py
@@ -144,6 +144,11 @@ def export(
             max_seq_len_to_capture=None,
         )
 
+        if model_config.nemo_model_config.get("fp8", False):
+            LOGGER.warning(
+                "NeMo FP8 checkpoint detected, but exporting FP8 quantized engines is not supported for vLLM."
+            )
+
         parallel_config = ParallelConfig(
             pipeline_parallel_size=pipeline_parallel_size, tensor_parallel_size=tensor_parallel_size
         )
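
The added check can be tried in isolation with the minimal sketch below. It assumes the NeMo model config behaves like a plain dict with an optional boolean "fp8" key, which is all the diff shows. The helper name `warn_if_fp8` and the logger name are hypothetical and are not part of nemo/export/vllm_exporter.py.

```python
import logging

# Hypothetical logger name; the exporter defines its own LOGGER.
LOGGER = logging.getLogger("vllm_export_sketch")


def warn_if_fp8(nemo_model_config: dict) -> bool:
    """Log a warning and return True when the config flags an FP8 checkpoint.

    Assumption: the config is dict-like and a missing "fp8" key means
    the checkpoint is not FP8, mirroring .get("fp8", False) in the diff.
    """
    is_fp8 = bool(nemo_model_config.get("fp8", False))
    if is_fp8:
        LOGGER.warning(
            "NeMo FP8 checkpoint detected, but exporting FP8 quantized engines is not supported for vLLM."
        )
    return is_fp8


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    warn_if_fp8({"fp8": True})   # emits the warning
    warn_if_fp8({})              # silent: key absent, treated as non-FP8
```

As in the commit, the check only warns and does not raise, so the export proceeds either way.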