Partial support of Apple M1/M2 (via CPU mode) #504
base: main
llama/generation.py
Outdated
@@ -265,7 +273,7 @@ def chat_completion(
 f"{B_INST} {(dialog[-1]['content']).strip()} {E_INST}",
 bos=True,
 eos=False,
-)
+).to(default_device())
AttributeError on this line: 'list' object has no attribute 'to'
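This is the message Python raises when `.to(...)` is called on a plain list. Below is a minimal sketch of the failure and one possible fix, assuming the tokenizer's `encode` returns a list of integer token IDs (the `'list'` in the message suggests as much). `fake_encode`, `reproduce_error`, and `to_device_tensor` are hypothetical names used for illustration only; they are not part of the PR.

```python
from typing import List


def fake_encode(text: str) -> List[int]:
    """Hypothetical stand-in for a tokenizer's encode(): returns a plain list of ints."""
    return [ord(ch) for ch in text]


def reproduce_error(token_ids: List[int]) -> str:
    """Call .to() on a list and return the resulting error message."""
    try:
        token_ids.to("cpu")  # lists have no .to(); only tensors do
    except AttributeError as err:
        return str(err)
    return ""


def to_device_tensor(token_ids: List[int], device: str = "cpu"):
    """Wrap the token IDs in a tensor first; the tensor is then created on `device`."""
    import torch  # imported lazily so the reproduction above runs without PyTorch

    return torch.tensor(token_ids, dtype=torch.long, device=device)


print(reproduce_error(fake_encode("hi")))
```

In other words, the device placement probably belongs where the token list is turned into a tensor, not on the return value of `encode` itself.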
Thank you, resolved.
Somehow this only affected example_chat_completion.py, so I missed it.
However, example_chat_completion.py also has failing assertions, which do not appear to be related to my changes. I can make it work, but this would require:
- removing the assert
- explicitly adjusting the tensor shape casting
Neither of these changes should be committed (but I can still commit them if needed).
This is pretty cool. We should look to land this once it supports MPS. cc @malfet, who has been looking a lot at LLM inference on M1.
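For context on what MPS support could involve, here is a rough sketch of a device-selection helper that prefers CUDA, then MPS, then CPU. It is not the PR's `default_device()`, whose body is not shown here; `pick_device` and `detect_device` are hypothetical names, and the preference order is an assumption.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Choose a device name from availability flags (assumed order: cuda, mps, cpu)."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Query PyTorch for the available backends and pick one."""
    import torch  # imported lazily so pick_device() is usable without PyTorch

    return pick_device(
        torch.cuda.is_available(),
        torch.backends.mps.is_available(),
    )


# The selection logic can be checked on its own, without any GPU present.
print(pick_device(cuda_available=False, mps_available=True))
```

Keeping the decision separate from the PyTorch queries makes the fallback order easy to test on machines that have neither backend.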
What is the expected performance (roughly, in tokens per second) for the weakest option, a 2020 MacBook Air with 8 GB? It would be nice to have a table covering the different hardware and model sizes.
return True


def distrubuted_device():
example of the run: