
[Feature]: Does VLLM only support MistralModel Architecture for embedding? #7915

Closed · hahmad2008 opened this issue Aug 27, 2024 · 4 comments


hahmad2008 commented Aug 27, 2024

🚀 The feature, motivation and pitch

Does VLLM only support MistralModel Architecture for embedding?

```python
_EMBEDDING_MODELS = {
    "MistralModel": ("llama_embedding", "LlamaEmbeddingModel"),
}
```

I tried to force embedding mode on by setting `model_config.embedding_mode = True`, and this error was raised:

Activating the server engine with embedding enabled.
```
INFO 08-27 14:54:06 async_llm_engine.py:173] Added request embd-69a08211c22a4db9baa14c2da3db9dcd-0.
ERROR 08-27 14:54:06 async_llm_engine.py:56] Engine background task failed
ERROR 08-27 14:54:06 async_llm_engine.py:56] Traceback (most recent call last):
ERROR 08-27 14:54:06 async_llm_engine.py:56]   File "myenv/lib/python3.9/site-packages/vllm/engine/async_llm_engine.py", line 46, in _log_task_completion
ERROR 08-27 14:54:06 async_llm_engine.py:56]     return_value = task.result()
ERROR 08-27 14:54:06 async_llm_engine.py:56]   File "myenv/lib/python3.9/site-packages/vllm/engine/async_llm_engine.py", line 637, in run_engine_loop
ERROR 08-27 14:54:06 async_llm_engine.py:56]     result = task.result()
ERROR 08-27 14:54:06 async_llm_engine.py:56]   File "myenv/lib/python3.9/site-packages/vllm/engine/async_llm_engine.py", line 578, in engine_step
ERROR 08-27 14:54:06 async_llm_engine.py:56]     request_outputs = await self.engine.step.remote()  # type: ignore
ERROR 08-27 14:54:06 async_llm_engine.py:56] ray.exceptions.RayTaskError(AttributeError): ray::_AsyncLLMEngine.step() (pid=40485, ip=10.5.8.112, actor_id=da65e597172ea5f7dea0a8b601000000, repr=<vllm.engine.async_llm_engine._AsyncLLMEngine object at 0x7fd4c6e27250>)
ERROR 08-27 14:54:06 async_llm_engine.py:56]   File "myenv/lib/python3.9/concurrent/futures/_base.py", line 439, in result
ERROR 08-27 14:54:06 async_llm_engine.py:56]     return self.__get_result()
ERROR 08-27 14:54:06 async_llm_engine.py:56]   File "myenv/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
ERROR 08-27 14:54:06 async_llm_engine.py:56]     raise self._exception
ERROR 08-27 14:54:06 async_llm_engine.py:56]   File "myenv/lib/python3.9/site-packages/vllm/engine/llm_engine.py", line 911, in step
ERROR 08-27 14:54:06 async_llm_engine.py:56]     output = self.model_executor.execute_model(
ERROR 08-27 14:54:06 async_llm_engine.py:56]   File "myenv/lib/python3.9/site-packages/vllm/executor/ray_gpu_executor.py", line 273, in execute_model
ERROR 08-27 14:54:06 async_llm_engine.py:56]     return super().execute_model(execute_model_req)
ERROR 08-27 14:54:06 async_llm_engine.py:56]   File "myenv/lib/python3.9/site-packages/vllm/executor/distributed_gpu_executor.py", line 76, in execute_model
ERROR 08-27 14:54:06 async_llm_engine.py:56]     driver_outputs = self._driver_execute_model(execute_model_req)
ERROR 08-27 14:54:06 async_llm_engine.py:56]   File "myenv/lib/python3.9/site-packages/vllm/executor/ray_gpu_executor.py", line 266, in _driver_execute_model
ERROR 08-27 14:54:06 async_llm_engine.py:56]     return self.driver_worker.execute_method("execute_model",
ERROR 08-27 14:54:06 async_llm_engine.py:56]   File "myenv/lib/python3.9/site-packages/vllm/worker/worker_base.py", line 383, in execute_method
ERROR 08-27 14:54:06 async_llm_engine.py:56]     raise e
ERROR 08-27 14:54:06 async_llm_engine.py:56]   File "myenv/lib/python3.9/site-packages/vllm/worker/worker_base.py", line 374, in execute_method
ERROR 08-27 14:54:06 async_llm_engine.py:56]     return executor(*args, **kwargs)
ERROR 08-27 14:54:06 async_llm_engine.py:56]   File "myenv/lib/python3.9/site-packages/vllm/worker/worker_base.py", line 236, in execute_model
ERROR 08-27 14:54:06 async_llm_engine.py:56]     self.model_runner.prepare_model_input(
ERROR 08-27 14:54:06 async_llm_engine.py:56]   File "myenv/lib/python3.9/site-packages/vllm/worker/model_runner.py", line 1227, in prepare_model_input
ERROR 08-27 14:54:06 async_llm_engine.py:56]     sampling_metadata = SamplingMetadata.prepare(seq_group_metadata_list,
ERROR 08-27 14:54:06 async_llm_engine.py:56]   File "myenv/lib/python3.9/site-packages/vllm/model_executor/sampling_metadata.py", line 126, in prepare
ERROR 08-27 14:54:06 async_llm_engine.py:56]     ) = _prepare_seq_groups(seq_group_metadata_list, seq_lens, query_lens,
ERROR 08-27 14:54:06 async_llm_engine.py:56]   File "myenv/lib/python3.9/site-packages/vllm/model_executor/sampling_metadata.py", line 218, in _prepare_seq_groups
ERROR 08-27 14:54:06 async_llm_engine.py:56]     if sampling_params.seed is not None:
ERROR 08-27 14:54:06 async_llm_engine.py:56] AttributeError: 'NoneType' object has no attribute 'seed'
```
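For context on the traceback: forcing `embedding_mode = True` only flips a config flag, while the request still flows down the generation path, so `_prepare_seq_groups` dereferences the `sampling_params` that embedding requests never carry, hence the `AttributeError`. For comparison, a minimal sketch of the embedding path that was registered at the time, assuming a vLLM ~0.5.x install and the `intfloat/e5-mistral-7b-instruct` checkpoint (whose `MistralModel` architecture matches the registry above):

```python
from vllm import LLM

# Minimal sketch, not a definitive recipe. e5-mistral-7b-instruct reports
# the "MistralModel" architecture, so it matches _EMBEDDING_MODELS above
# and vLLM loads it as LlamaEmbeddingModel in embedding mode automatically.
llm = LLM(model="intfloat/e5-mistral-7b-instruct")

# encode() returns EmbeddingRequestOutput objects rather than generations.
outputs = llm.encode(["What is the capital of France?"])
print(len(outputs[0].outputs.embedding))  # embedding dimension
```

A model whose `config.json` reports any other architecture falls through this registry, gets loaded as a generation model, and ends up in the failing code path shown in the traceback.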

mgoin commented Aug 27, 2024

Yes, support for other model architectures for embedding still needs to be added.

hahmad2008 (Author) commented

@mgoin Does this need a separate implementation, or does it share the implementation with the current code?
Also, can I use Meta-Llama-3-8B-Instruct for embedding, or only models that are designed to be embedding models?

github-actions bot commented Nov 27, 2024

This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!

github-actions bot added the stale label Nov 27, 2024

mgoin commented Nov 27, 2024

Support for embedding models has greatly improved since then! https://docs.vllm.ai/en/latest/models/supported_models.html#text-embedding
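For reference, a minimal sketch of current usage, assuming a vLLM release new enough to accept the `task` argument (older releases spelled the value `"embedding"` rather than `"embed"`):

```python
from vllm import LLM

# Sketch under the assumption above: the task argument routes the model
# through the pooling/embedding runner instead of the generation runner,
# so supported embedding architectures beyond MistralModel work as well.
llm = LLM(model="intfloat/e5-mistral-7b-instruct", task="embed")

outputs = llm.encode(["What is the capital of France?"])
print(len(outputs[0].outputs.embedding))
```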

mgoin closed this as completed Nov 27, 2024