TCEval v2

Install

    cd lm-evaluation-harness_mr-revised
    pip3 install -e ".[vllm]"
    pip3 install -U vllm
    cd ..

Evaluate Local Models (MMLU, TMMLU+, and Penguin_Table)

Please refer to the examples.

Evaluate API Models (MMLU, TMMLU+, and Penguin_Table)

Please check scripts/cal_likelihood_by_api.py.

Evaluate MTBench-tw

Please refer here.
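
Post-install check

After the install steps above, it can help to confirm that the packages are importable before starting an evaluation run. The following is a minimal sketch, not part of this repository. It assumes the editable install exposes a module named `lm_eval` (the name used by the upstream lm-evaluation-harness; the revised copy bundled here may differ) and that vLLM is importable as `vllm`. The helper `missing_packages` is a hypothetical name introduced for illustration.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level module names in `names` that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed module names: `lm_eval` (upstream harness package) and `vllm`.
    # Adjust them if the bundled harness installs under a different name.
    required = ["lm_eval", "vllm"]
    missing = missing_packages(required)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

If a package is reported missing, rerunning the install commands inside the same Python environment is the usual first step.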