[Launch Blocker] The default --quantize config/data/desktop.json is slow with eager, compile and aoti #661
One of the action items from issue #621 was to add a default desktop.json quantization config. After it was added, I tried running with that config, and it was slow:
Average tokens/sec: 0.67
Average tokens/sec: 0.60
Average tokens/sec: (pending; it is currently taking a long time to produce the first byte)
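For anyone trying to reproduce these numbers outside the CLI, here is a minimal sketch of how an average tokens/sec figure could be measured. It is not the project's own benchmarking code: `generate_fn` is a hypothetical stand-in for whatever produces tokens, and averaging the per-run rates is an assumption about how the reported average is defined.

```python
import time
from typing import Callable, Iterable


def average_tokens_per_sec(
    generate_fn: Callable[[], Iterable[object]],
    runs: int = 3,
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Return the mean tokens/sec over `runs` calls to `generate_fn`.

    `generate_fn` is a hypothetical callable that yields one item per
    generated token. The average is the mean of the per-run rates, which
    is an assumption and may differ from how the tool reports it.
    """
    if runs <= 0:
        raise ValueError("runs must be positive")

    rates = []
    for _ in range(runs):
        start = clock()
        # Consume the whole stream and count tokens as they arrive.
        n_tokens = sum(1 for _ in generate_fn())
        elapsed = clock() - start
        if elapsed <= 0:
            raise RuntimeError("elapsed time was not positive; clock too coarse")
        rates.append(n_tokens / elapsed)

    return sum(rates) / len(rates)
```

Passing a function that wraps the actual generation call would give a figure comparable in spirit to the ones above. The time to first token would need to be recorded separately, since that is what is stalling the third measurement.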
The default desktop.json setting should be performant. The action item is to either change desktop.json to a config that is fast, or make execution with the current config performant.
Setup:
git commit: 4a83474
python version: 3.10.0
MacBook Pro M1
Internal Task: T187941181