When running cache-llama-activations.py in the deployment directory to generate activations.pickle, an assert(False) error is raised in the parallel_pack function of the QuantK class in deployment/transformers/src/transformers/models/llama/modeling_llama.py, with self.include_sparse set to 0 (see the attached screenshot; a minimal sketch of the failure path is shown below). The workflow appears to break at this step.
The quantizers.pickle file is generated successfully. Should the instructions in the README be adjusted so that activations.pickle can also be generated?
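For context, here is a minimal, hypothetical reconstruction of the failure path. The class body, method signature, and messages below are illustrative assumptions only, not the repository's actual code; they just show why a dense-only configuration (include_sparse = 0) can hit an `assert False` branch inside parallel_pack.

```python
# Hypothetical, simplified sketch of the reported failure path.
# The real QuantK class in modeling_llama.py is far more involved; this only
# illustrates how include_sparse=0 can reach an unhandled `assert False`.
import torch


class QuantK:
    def __init__(self, include_sparse: int = 0):
        # include_sparse is assumed to come from the quantizer config;
        # 0 disables the sparse-outlier path.
        self.include_sparse = include_sparse

    def parallel_pack(self, activations: torch.Tensor):
        if self.include_sparse:
            # Sparse path: pack dense values plus outliers (omitted here).
            return activations
        # Dense-only path is not handled, so the code bails out,
        # which is the assert(False) reported in this issue.
        assert False, "parallel_pack expects include_sparse != 0"


quant = QuantK(include_sparse=0)
try:
    quant.parallel_pack(torch.randn(4, 8))
except AssertionError as exc:
    print("AssertionError:", exc)
```

If this reading is correct, the README step that produces quantizers.pickle may need to be run with sparsity enabled (or the caching script pointed at a sparse quantizer) before activations.pickle can be generated.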