
Fix bfloat16 serialization for tensors with zero elements #560

Merged: 2 commits into master on Mar 28, 2023

Conversation

@borzunov (Member) commented Mar 28, 2023

Follow-up to #553.

Dummy tensors are widely used across Petals, but with hivemind==1.1.6 we get the following error:

Mar 28 23:05:40.339 [WARN] [hivemind.p2p.p2p_daemon._process_stream:440] Handler failed with the exception:
Traceback (most recent call last):
  File "/home/jheuristic/anaconda3/envs/py38_petals_borzunov/lib/python3.8/site-packages/hivemind/p2p/p2p_daemon.py", line 431, in _process_stream
    async for response in handler(_read_stream(), context):
  File "/home/jheuristic/anaconda3/envs/py38_petals_borzunov/lib/python3.8/site-packages/hivemind/p2p/p2p_daemon.py", line 529, in _stream_handler
    async for item in output:
  File "/storage/hdd1/jheuristic/exp/decentralized/borzunov/petals/src/petals/server/handler.py", line 137, in rpc_inference
    hidden_states, prompts, hypo_ids = map(deserialize_torch_tensor, request.tensors)
  File "/home/jheuristic/anaconda3/envs/py38_petals_borzunov/lib/python3.8/site-packages/hivemind/compression/serialization.py", line 47, in deserialize_torch_tensor
    return compression.extract(serialized_tensor).requires_grad_(serialized_tensor.requires_grad)
  File "/home/jheuristic/anaconda3/envs/py38_petals_borzunov/lib/python3.8/site-packages/hivemind/compression/base.py", line 108, in extract
    if len(serialized_tensor.buffer) // shape.numel() == 4:  # legacy mode: convert to fp32
ZeroDivisionError: integer division or modulo by zero

This happens even for the simplest inference queries.
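The crash comes from the legacy-mode heuristic in `hivemind/compression/base.py`: it divides the buffer length by `shape.numel()`, which is 0 for a dummy (zero-element) tensor. A minimal sketch of the kind of guard that fixes it, with a hypothetical helper name and a plain shape tuple standing in for `torch.Size`:

```python
from math import prod

def is_legacy_fp32(buffer: bytes, shape: tuple) -> bool:
    # Hypothetical guard illustrating the fix: a zero-element tensor
    # must be handled before the integer division, which otherwise
    # raises ZeroDivisionError for numel == 0.
    numel = prod(shape)
    if numel == 0:
        return False  # dummy tensor: empty buffer, no legacy fp32 payload
    return len(buffer) // numel == 4  # legacy mode stored 4 bytes/element

# A dummy-tensor shape like those used across Petals no longer crashes:
print(is_legacy_fp32(b"", (0, 5)))         # False
print(is_legacy_fp32(b"\x00" * 12, (3,)))  # True: 4 bytes per element
```

This keeps the legacy fp32 detection intact for non-empty tensors while treating zero-element tensors as non-legacy by definition.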

@borzunov borzunov requested review from justheuristic and mryab March 28, 2023 23:27
@codecov bot commented Mar 28, 2023

Codecov Report

Merging #560 (67704d7) into master (98531ce) will increase coverage by 0.04%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #560      +/-   ##
==========================================
+ Coverage   75.88%   75.92%   +0.04%     
==========================================
  Files          81       81              
  Lines        8008     8009       +1     
==========================================
+ Hits         6077     6081       +4     
+ Misses       1931     1928       -3     
Impacted Files            Coverage Δ
hivemind/compression/base.py   94.52% <100.00%> (+0.07%) ⬆️

... and 3 files with indirect coverage changes

@borzunov borzunov merged commit 3164928 into master Mar 28, 2023
@borzunov borzunov deleted the fix-bfloat16-compression branch March 28, 2023 23:57
borzunov added a commit to bigscience-workshop/petals that referenced this pull request Mar 29, 2023
This PR fixes issues of #290:

- hivemind bfloat16 codec crashed on dummy tensors (with 0 elements), see learning-at-home/hivemind#560 (this PR makes Petals depend on the latest hivemind version from the repo, it's temporary)
- transformers version check mismatched with the version allowed in `setup.cfg`

Also:

- This PR enables 8-bit by default for TP. Even though TP in 8-bit may be slower, we currently prefer to host more blocks to increase the network's stability.
mryab pushed a commit that referenced this pull request Mar 31, 2023