Add inference support for MacBook silicon chip
Signed-off-by: Benjamin Huo <[email protected]>
benjaminhuo committed Jul 30, 2023
1 parent de5dfe2 commit 4346157
Showing 2 changed files with 4 additions and 0 deletions.
2 changes: 2 additions & 0 deletions inference/README.md
@@ -56,6 +56,8 @@ For the falcon-7b model, you can use the following command:
python3 serve/gorilla_falcon_cli.py --model-path path/to/gorilla-falcon-7b-hf-v0
```

> Add `--device mps` if you're running on a MacBook with an Apple silicon chip.
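
For example (assuming `gorilla_falcon_cli.py` accepts the same `--device` flag as `gorilla_cli.py`), the falcon-7b command above becomes:

```bash
python3 serve/gorilla_falcon_cli.py --model-path path/to/gorilla-falcon-7b-hf-v0 --device mps
```
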
### [Optional] Batch Inference on a Prompt File

After downloading the model, you need to make a jsonl file containing all the questions you want to run through Gorilla. Here is [one example](https://github.com/ShishirPatil/gorilla/blob/main/inference/example_questions/example_questions.jsonl):
2 changes: 2 additions & 0 deletions inference/serve/gorilla_cli.py
@@ -67,6 +67,8 @@ def load_model(
}
else:
kwargs["max_memory"] = {i: max_gpu_memory for i in range(num_gpus)}
elif device == "mps":
kwargs = {"torch_dtype": torch.float16}
else:
raise ValueError(f"Invalid device: {device}")

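For context, here is a minimal sketch of how this device branch could sit inside a `load_model`-style helper. Only the `mps` branch mirrors this diff; the `cpu`/`cuda` branches, the function signature, and the Hugging Face loading calls are illustrative assumptions, not the actual contents of `gorilla_cli.py`.

```python
# Illustrative sketch only; just the `mps` branch is taken from this commit.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer


def load_model(model_path, device, num_gpus=1, max_gpu_memory=None):
    if device == "cpu":
        kwargs = {"torch_dtype": torch.float32}  # assumed CPU default
    elif device == "cuda":
        kwargs = {"torch_dtype": torch.float16}
        if num_gpus > 1:
            # Shard across GPUs, optionally capping memory per device.
            kwargs["device_map"] = "auto"
            if max_gpu_memory is not None:
                kwargs["max_memory"] = {i: max_gpu_memory for i in range(num_gpus)}
    elif device == "mps":
        # Apple silicon: run in fp16 on the Metal (MPS) backend.
        kwargs = {"torch_dtype": torch.float16}
    else:
        raise ValueError(f"Invalid device: {device}")

    tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=False)
    model = AutoModelForCausalLM.from_pretrained(
        model_path, low_cpu_mem_usage=True, **kwargs
    )
    if device in ("cuda", "mps") and num_gpus == 1:
        model.to(device)
    return model, tokenizer
```

With `--device mps`, PyTorch dispatches tensors to Apple's Metal Performance Shaders backend, which requires PyTorch 1.12 or newer on macOS.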
