dynamic estimate of required memory usage #438
hold up, need to fix perplexity. update: still investigating.
@Green-Sky UB is hard to fix, I really appreciate it! I'll try this PR tomorrow. Before that, let me make a tentative suggestion: think about what happens if a new segmentation fault occurs again and takes time to fix.
it is only UB if you run without address sanitizer 😉
so, 32GiB is not enough to run perplexity (defaults) on 7B q4_0. edit: with context 1024. edit: #407 changes how this works
for some reason @ggerganov pushed 4870e45 👀
Unfortunately, this still doesn't fix the memory allocation issues :( From what I can tell, it pretty much wraps stuff in vectors and adds an assertion to force the code to fail rather than segfaulting.
@ggerganov promised a memory overhaul here #407 (comment) so i am closing this pr.
Runs smoothly, thanks!
officially replaced by #473
uses observations made in #213 and replaces it.
fixes
ggml_new_tensor_impl: not enough space in the context's memory pool
and the resulting segfaults.
this is still as much of a hack as it was before, but this time it is working.
this could potentially fix a bunch of issues. ( fixes #153 )
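
As a rough illustration of the failure mode and of what a dynamic estimate could look like, here is a minimal sketch. It is not the ggml API and not the code in this PR: `Pool`, `estimate_pool_size`, the 10% headroom, and the idea of scaling a measured per-token cost are all assumptions made for the example.

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical fixed-size memory pool with bump allocation (not the ggml API).
// Alignment is ignored to keep the sketch short.
struct Pool {
    std::vector<unsigned char> buf;
    std::size_t used = 0;

    explicit Pool(std::size_t size) : buf(size) {}

    // Fails loudly and returns nullptr instead of handing out memory
    // past the end of the buffer, which is what leads to a segfault.
    void * alloc(std::size_t n) {
        const std::size_t available = buf.size() - used;
        if (n > available) {
            std::fprintf(stderr, "pool: not enough space (needed %zu, available %zu)\n",
                         n, available);
            return nullptr;
        }
        void * p = buf.data() + used;
        used += n;
        return p;
    }
};

// Hypothetical estimate: scale the per-token usage observed in an earlier
// run by the number of tokens, add 10% headroom, and never go below min_size.
std::size_t estimate_pool_size(std::size_t bytes_per_token,
                               std::size_t n_tokens,
                               std::size_t min_size) {
    std::size_t need = bytes_per_token * n_tokens;
    need += need / 10;
    return need > min_size ? need : min_size;
}
```

The idea would be to size the pool with `estimate_pool_size` before each evaluation rather than relying on a fixed constant, and to treat a `nullptr` from `alloc` as a hard error at the call site.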