killed by os when running mac m3max and 128G Mem #277

Open
yangjiandan opened this issue Mar 22, 2024 · 5 comments


yangjiandan commented Mar 22, 2024

dan@MacBook-Pro grok-1 % python3.11 run.py
INFO:jax._src.xla_bridge:Unable to initialize backend 'cuda':
INFO:jax._src.xla_bridge:Unable to initialize backend 'rocm': module 'jaxlib.xla_extension' has no attribute 'GpuAllocatorConfig'
INFO:jax._src.xla_bridge:Unable to initialize backend 'tpu': INTERNAL: Failed to open libtpu.so: dlopen(libtpu.so, 0x0001): tried: 'libtpu.so' (no such file), '/System/Volumes/Preboot/Cryptexes/OSlibtpu.so' (no such file), '/opt/homebrew/lib/libtpu.so' (no such file), '/System/Volumes/Preboot/Cryptexes/OS/opt/homebrew/lib/libtpu.so' (no such file), '/usr/lib/libtpu.so' (no such file, not in dyld cache), 'libtpu.so' (no such file), '/usr/local/lib/libtpu.so' (no such file), '/usr/lib/libtpu.so' (no such file, not in dyld cache)
INFO:rank:Initializing mesh for self.local_mesh_config=(1, 1) self.between_hosts_config=(1, 1)...
INFO:rank:Detected 1 devices in mesh
INFO:rank:partition rules: <bound method LanguageModelConfig.partition_rules of LanguageModelConfig(model=TransformerConfig(emb_size=6144, key_size=128, num_q_heads=48, num_kv_heads=8, num_layers=64, vocab_size=131072, widening_factor=8, attn_output_multiplier=0.08838834764831845, name=None, num_experts=8, capacity_factor=1.0, num_selected_experts=2, init_scale=1.0, shard_activations=True, data_axis='data', model_axis='model'), vocab_size=131072, pad_token=0, eos_token=2, sequence_len=8192, model_size=6144, embedding_init_scale=1.0, embedding_multiplier_scale=78.38367176906169, output_multiplier_scale=0.5773502691896257, name=None, fprop_dtype=<class 'jax.numpy.bfloat16'>, model_type=None, init_scale_override=None, shard_embeddings=True)>
INFO:rank:(1, 256, 6144)
INFO:rank:(1, 256, 131072)
INFO:rank:State sharding type: <class 'model.TrainingState'>
INFO:rank:(1, 256, 6144)
INFO:rank:(1, 256, 131072)
INFO:rank:Loading checkpoint at ./checkpoints/ckpt-0
zsh: killed python3.11 run.py
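For context, a quick way to confirm which backend JAX fell back to and how much RAM is free before the checkpoint load starts (a minimal sketch, not from this thread; assumes jax is installed and pulls in the third-party psutil package):

# check_env.py - sanity checks before attempting to load the Grok-1 checkpoint
import jax
import psutil  # third-party: pip install psutil

# With cuda/rocm/tpu all unavailable (as in the log above), this prints "cpu".
print("JAX backend:", jax.default_backend())
print("JAX devices:", jax.devices())

# Compare free system memory against a rough size for the released
# checkpoint (300 GB+, per the discussion below).
free_gb = psutil.virtual_memory().available / 1024**3
print(f"Available RAM: {free_gb:.1f} GB")
if free_gb < 300:
    print("Warning: less free RAM than the checkpoint size; the OS may kill the process.")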

@yangjiandan changed the title from "killed by os when running mac m3max and 128G" to "killed by os when running mac m3max and 128G Mem" on Mar 22, 2024
@yangjiandan (Author) commented:

(screenshot attached; no text recoverable)

@davidearlyoung commented:

Likely an OOM (out-of-memory) situation, which is expected for a model whose raw weights are larger than 128 GB (300 GB+ for the full model). With 128 GB of memory, you will likely need to wait before you can run the whole model on CPU. 4-bit quantization may be able to get the model down to about 96 GB, with some quality loss in the output due to quantization effects, though that weight size is still speculation. See #42.
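A back-of-the-envelope check on those numbers (a sketch; the ~314B parameter count is the commonly cited figure for Grok-1, not stated in this thread):

# rough_size.py - raw weight-memory estimate at a few quantization levels
PARAMS = 314e9  # Grok-1 is commonly cited as ~314B parameters

for label, bits in [("bf16", 16), ("int8", 8), ("4-bit", 4)]:
    gib = PARAMS * bits / 8 / 1024**3
    print(f"{label:>5}: ~{gib:.0f} GiB")

This gives roughly 585 GiB for bf16, 292 GiB for int8 (consistent with the 300 GB+ figure above), and 146 GiB for a uniform 4-bit quant; GGUF formats mix bit widths per tensor, which is why actual file sizes (like the speculated 96 GB above) can differ from the uniform estimate.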

@rankaiyx commented:

https://huggingface.co/Arki05/Grok-1-GGUF
Measured in practice, the Q3_XS quantized model needs only about 124 GB of RAM.
Inference speed is similar to that of the Q4-quantized miqu-70B.
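If going the GGUF route, one minimal way to drive it from Python is the llama-cpp-python bindings (a sketch; the model path, context size, and prompt are placeholders, not from this thread):

# run_gguf.py - minimal llama-cpp-python sketch for a GGUF quant of Grok-1
from llama_cpp import Llama  # pip install llama-cpp-python

llm = Llama(
    model_path="./grok-1-q3.gguf",  # hypothetical path to the downloaded quant
    n_ctx=2048,                     # context window; raise it if RAM allows
)
out = llm("The capital of France is", max_tokens=32)
print(out["choices"][0]["text"])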

@rankaiyx commented:

MiB Mem : 257617.7 total, 132751.1 free, 2548.8 used, 124396.7 buff/cache

llama_print_timings: load time = 87224.03 ms
llama_print_timings: sample time = 59.83 ms / 500 runs ( 0.12 ms per token, 8357.71 tokens per second)
llama_print_timings: prompt eval time = 7195.80 ms / 13 tokens ( 553.52 ms per token, 1.81 tokens per second)
llama_print_timings: eval time = 359604.96 ms / 499 runs ( 720.65 ms per token, 1.39 tokens per second)
llama_print_timings: total time = 367566.03 ms / 512 tokens

@yangjiandan (Author) commented:

@rankaiyx Thanks very much for your answer; I will try it later.
