From 3d603b41c5d48eb611c6eb74a4d7332682099cb3 Mon Sep 17 00:00:00 2001
From: DarkLight1337
Date: Mon, 25 Nov 2024 13:05:54 +0000
Subject: [PATCH 1/2] Fix missing code block

Signed-off-by: DarkLight1337
---
 docs/source/models/supported_models.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/models/supported_models.rst b/docs/source/models/supported_models.rst
index 54e2c4479c2c9..d0fde44a697f4 100644
--- a/docs/source/models/supported_models.rst
+++ b/docs/source/models/supported_models.rst
@@ -365,7 +365,7 @@ Text Embedding

 .. note::
    Unlike base Qwen2, :code:`Alibaba-NLP/gte-Qwen2-7B-instruct` uses bi-directional attention.
-   You can set `--hf-overrides '{"is_causal": false}'` to change the attention mask accordingly.
+   You can set :code:`--hf-overrides '{"is_causal": false}'` to change the attention mask accordingly.

    On the other hand, its 1.5B variant (:code:`Alibaba-NLP/gte-Qwen2-1.5B-instruct`) uses causal attention
    despite being described otherwise on its model card.

From 8086749d33d67ce8098cc8eec36647c127b4dca7 Mon Sep 17 00:00:00 2001
From: DarkLight1337
Date: Mon, 25 Nov 2024 13:20:16 +0000
Subject: [PATCH 2/2] Actually this is just untested

Signed-off-by: DarkLight1337
---
 docs/source/serving/compatibility_matrix.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/serving/compatibility_matrix.rst b/docs/source/serving/compatibility_matrix.rst
index a4300761d2635..fa03d2cde1486 100644
--- a/docs/source/serving/compatibility_matrix.rst
+++ b/docs/source/serving/compatibility_matrix.rst
@@ -393,7 +393,7 @@ Feature x Hardware
      - ✅
      - ✅
      - ✅
-     - ✗
+     - ?
    * - :abbr:`enc-dec (Encoder-Decoder Models)`
      - ✅
      - ✅
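
For reference, the flag fixed in PATCH 1/2 can also be applied in offline inference. The sketch below is a minimal, untested example that assumes vLLM's LLM constructor exposes the override as an hf_overrides keyword (the Python counterpart of --hf-overrides) and that task="embedding" and llm.encode() are available in the installed version:

    # Minimal sketch: enable bi-directional attention for gte-Qwen2-7B-instruct,
    # as described in the note patched above.
    # Assumptions: hf_overrides mirrors the --hf-overrides CLI flag
    # (JSON false -> Python False), and task="embedding" / llm.encode()
    # match the installed vLLM version.
    from vllm import LLM

    llm = LLM(
        model="Alibaba-NLP/gte-Qwen2-7B-instruct",
        task="embedding",
        hf_overrides={"is_causal": False},  # switch from causal to bi-directional attention
    )

    outputs = llm.encode(["Example sentence to embed."])
    print(outputs[0].outputs.embedding[:8])  # first few dimensions of the embedding

The equivalent server invocation, per the patched docs, passes the override as a JSON string: --hf-overrides '{"is_causal": false}'.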