Replies: 1 comment
Hi @ainilian, the slowness you're experiencing could be related to the default inference setup. For faster evaluation, I recommend using lmdeploy or vllm, which are optimized for inference on large models like the one you are evaluating. You can refer to the following configurations to get started:
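For reference, a minimal sketch of what a vLLM-backed model entry in an OpenCompass config can look like is below. The model path, context length, batch size, and generation settings are illustrative placeholders, not the exact configuration from this thread:

```python
# Sketch of an OpenCompass model entry using the vLLM backend.
# All concrete values (abbr, path, lengths, batch size) are placeholders;
# adjust them to the model and benchmark you are actually running.
from opencompass.models import VLLM

models = [
    dict(
        type=VLLM,
        abbr='my-model-vllm',                       # placeholder label
        path='internlm/internlm2-chat-7b',          # placeholder HF model path
        model_kwargs=dict(tensor_parallel_size=4),  # shard across 4 GPUs
        max_out_len=1024,
        max_seq_len=32768,                          # NeedleBench uses long contexts
        batch_size=16,
        generation_kwargs=dict(temperature=0),
        run_cfg=dict(num_gpus=4, num_procs=1),
    )
]
```

A TurboMind (lmdeploy) entry follows the same pattern with the corresponding model class; either backend should be substantially faster than the default HuggingFace runner for long-context evaluation.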
Additionally, it seems like you are testing a base model. Let me know if you need further assistance!
command:
```bash
CUDA_VISIBLE_DEVICES=4,5,6,7 python run.py configs/eval_needlebench.py --max-num-workers 4
```
GPU: A100-40G
config: configs/eval_needlebench.py