Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dynamic estimate of required memory usage #438

Closed
wants to merge 3 commits into from

Conversation

Green-Sky
Copy link
Collaborator

@Green-Sky Green-Sky commented Mar 23, 2023

uses observations made in #213 and replaces it.

fixes ggml_new_tensor_impl: not enough space in the context's memory pool and resulting Segfaults.

this is still as much of a hack as it was before, but this time it is working.

this could potentially fix a bunch of issues. ( fixes #153 )

@Green-Sky Green-Sky changed the title dynamic estimate of required memory usage WIP: dynamic estimate of required memory usage Mar 23, 2023
@Green-Sky Green-Sky marked this pull request as ready for review March 23, 2023 18:37
@Green-Sky Green-Sky force-pushed the memory_investigation branch from c9aa526 to 660e1df Compare March 23, 2023 18:41
@Green-Sky Green-Sky changed the title WIP: dynamic estimate of required memory usage dynamic estimate of required memory usage Mar 23, 2023
@Green-Sky Green-Sky force-pushed the memory_investigation branch from 660e1df to 636a954 Compare March 23, 2023 18:44
@Green-Sky Green-Sky requested review from anzz1, sw and antimatter15 March 23, 2023 18:45
@Green-Sky Green-Sky requested a review from ggerganov March 23, 2023 18:53
llama.cpp Outdated Show resolved Hide resolved
@Green-Sky Green-Sky force-pushed the memory_investigation branch 2 times, most recently from f0e79f4 to 4e64e37 Compare March 23, 2023 19:00
@Green-Sky Green-Sky force-pushed the memory_investigation branch from 4e64e37 to 424281a Compare March 23, 2023 19:13
@Green-Sky
Copy link
Collaborator Author

Green-Sky commented Mar 23, 2023

hold up, need to fix perplexity.

update: still investigating.

@Green-Sky Green-Sky marked this pull request as draft March 23, 2023 19:32
@mqy
Copy link
Contributor

mqy commented Mar 23, 2023

@Green-Sky UB is hard to fix, I really appreciate!

I'll try this PR tomorrow. Before that, let me to make an immature suggestion:

Think about the situation that new segmentation fault occur again, but still take time fix.
Can we add an experimental cli parameter (or env variable), to allow user configure the max memory pool . According to his/her available RAM size (e.g. at least 1 GB). This benefit users who has big RAM and eager to evaluate.

@Green-Sky
Copy link
Collaborator Author

UB is hard to fix, I really appreciate!

it is only UB if you run without address sanitizer 😉

@Green-Sky
Copy link
Collaborator Author

Green-Sky commented Mar 23, 2023

==1743825==ERROR: AddressSanitizer: allocator is out of memory trying to allocate 0x1000000000 bytes
    #0 0x7f6d6c849808 in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:144
    #1 0x561e0c966487 in llama_eval_internal /home/green/workspace/github/llama.cpp/llama.cpp:636
    #2 0x561e0c9682bb in llama_eval /home/green/workspace/github/llama.cpp/llama.cpp:1460
    #3 0x561e0c879365 in main /home/green/workspace/github/llama.cpp/main.cpp:229
    #4 0x7f6d6b890082 in __libc_start_main ../csu/libc-start.c:308

==1743825==HINT: if you don't care about these errors you may set allocator_may_return_null=1
SUMMARY: AddressSanitizer: out-of-memory ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:144 in __interceptor_malloc
==1743825==ABORTING

so, 32GiB are not enough to run perplexity (defaults) on 7B q4_0 . edit: with context 1024
and 64GiB fail to allocate on my machine ...

edit: #407 changes how this works

@Green-Sky
Copy link
Collaborator Author

I decided to wait on #439 and maybe #407, so i don't break perplexity.

@Green-Sky Green-Sky force-pushed the memory_investigation branch from 3c31292 to 5dd94f7 Compare March 23, 2023 20:46
@Green-Sky
Copy link
Collaborator Author

for some reason @ggerganov pushed 4870e45 👀

@PerthGoat
Copy link

PerthGoat commented Mar 24, 2023

for some reason @ggerganov pushed 4870e45 👀

Unfortunately, this still doesn't fix the memory allocation issues :(

From what I can tell, it pretty much wraps stuff in vectors and adds an assertion to force the code to fail rather than segfaulting.

@Green-Sky
Copy link
Collaborator Author

Green-Sky commented Mar 24, 2023

@ggerganov promised an memory overhaul here #407 (comment)

so i am closing this pr.

@wafflecomposite
Copy link

Runs smoothly, thanks!

@Green-Sky
Copy link
Collaborator Author

officially replaced by #473

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ggml_new_tensor_impl: not enough space in the context's memory pool (needed 717778556, available 454395136)
5 participants