Performance degrading over time #832

vashat · 2023-04-07T12:48:25Z

Expected Behavior

When running this command:

./main -i --interactive-first -r "### Human:" --temp 0 -c 2048 -n -1 --ignore-eos --repeat_penalty 1.2 --threads 4 --instruct -m models/ggml-vicuna-13b-4bit.bin

I expect the performance to be the same over time when the model is answering my questions.

Current Behavior

The performance is good in the beginning: answers are written out quickly and 4 CPU cores are fully utilized. Over time, however, the speed degrades until output slows to about one word every 30 seconds and the CPU cores sit idle.

Environment and Context

Apple M1 Mac Mini 16GB RAM. Ventura 13.3.

Python 3.8.13
GNU Make 3.81
Apple clang version 14.0.0 (clang-1400.0.29.202)

numpy                         1.23.4
rotary-embedding-torch        0.2.1
sentencepiece                 0.1.97
torch                         2.1.0.dev20230307
torchaudio                    2.0.0.dev20230307
torchvision                   0.15.0.dev20230307

Steps to Reproduce

Ask questions for a while. The speed should degrade after about 10 questions that require longer answers.


vashat · 2023-04-07T13:13:10Z

Found the solution in #767: I needed to add the --mlock parameter.
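For anyone landing here with the same symptom, this is a sketch of what the fixed invocation would probably look like: the flags from the original report with --mlock appended. As far as I understand the option, it asks the OS to keep the model resident in RAM rather than swapping or compressing it, which would fit the idle CPU cores described above. The ARGS array and the check for ./main are just illustration and not part of llama.cpp. The model path is the one from the report.

```shell
#!/usr/bin/env bash
# Flags copied from the original report; only --mlock is new.
ARGS=(
  -i --interactive-first -r "### Human:"
  --temp 0 -c 2048 -n -1 --ignore-eos
  --repeat_penalty 1.2 --threads 4 --instruct
  -m models/ggml-vicuna-13b-4bit.bin
  --mlock
)

# Run only if the binary exists in the current directory;
# otherwise just print the command that would be run.
if [ -x ./main ]; then
  ./main "${ARGS[@]}"
else
  echo "would run: ./main ${ARGS[*]}"
fi
```

Locking a 13B 4-bit model needs enough free RAM to hold the whole file, so on a smaller machine this may fail or have no effect.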

vashat changed the title ~~[User] Insert summary of your issue or enhancement..~~ Performance degrading over time Apr 7, 2023

vashat closed this as completed Apr 7, 2023
