
fix: full gpu hybrid model #963

Merged
16 commits merged into main on Jan 6, 2025

Conversation

andrei-stoian-zama
Collaborator

@andrei-stoian-zama andrei-stoian-zama commented Dec 18, 2024

  • Adds support for GPU (and other devices) for the local computation of the hybrid model.
    • use torch.Tensors for the GLWE backend
    • keep the tensor device consistent
    • add a Torch quantizer
  • Memory optimization:
    • Removes the ONNX model for quantized models built for fully linear layers in the hybrid model
    • Remove calibration data once hybrid model calibration is done
  • Upgrades the use-case example CI to Ubuntu 22 (python 3.10)
  • Improves the LLAMA lora fine tuning use case (adds FHE execution)

Closes https://github.com/zama-ai/concrete-ml-internal/issues/4682
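The device-consistency point in the description above can be sketched with a minimal Torch-based quantizer. This is an illustrative stub, not Concrete ML's actual API: the idea is that quantizing with torch ops (rather than numpy) means the result automatically inherits the input tensor's device, so GPU inputs never round-trip through the CPU during local hybrid-model computation.

```python
import torch

# Hypothetical sketch (names are illustrative, not Concrete ML's API):
# quantize on whatever device the input tensor lives on, so GPU inputs
# stay on the GPU throughout the local computation.
def torch_quantize(values: torch.Tensor, scale: float, zero_point: int) -> torch.Tensor:
    # round(values / scale) + zero_point, computed with torch ops so the
    # result inherits values.device (CPU or CUDA) automatically
    q = torch.round(values / scale) + zero_point
    return q.to(torch.int64)

x = torch.linspace(-1.0, 1.0, steps=5)  # CPU tensor; the same code runs on CUDA
q = torch_quantize(x, scale=0.25, zero_point=0)
print(q.tolist())            # -> [-4, -2, 0, 2, 4]
print(q.device == x.device)  # -> True
```

The same pattern would apply to a GLWE backend that consumes torch.Tensors: as long as every step is a torch op, no explicit `.to(device)` bookkeeping is needed.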

@cla-bot cla-bot bot added the cla-signed label Dec 18, 2024
Base automatically changed from llama_fine_tuning to main December 19, 2024 15:33
@andrei-stoian-zama andrei-stoian-zama force-pushed the chore/optimize_mem_and_runtime_fhe_disable branch from 43601e1 to 2a326f3 Compare December 31, 2024 09:42
@andrei-stoian-zama andrei-stoian-zama marked this pull request as ready for review January 3, 2025 20:06
@andrei-stoian-zama andrei-stoian-zama requested a review from a team as a code owner January 3, 2025 20:06

github-actions bot commented Jan 4, 2025

⚠️ Known flaky tests have been rerun ⚠️

One or several tests initially failed but were identified as known flaky tests. They were therefore rerun and passed. See below for more details.

Failed tests details

Known flaky tests that initially failed:

  • tests/torch/test_compile_torch.py::test_compile_torch_or_onnx_conv_networks[False-True-CNN_conv1d-relu]


github-actions bot commented Jan 4, 2025

Coverage passed ✅

Coverage details

---------- coverage: platform linux, python 3.8.18-final-0 -----------
Name    Stmts   Miss  Cover   Missing
-------------------------------------
TOTAL    8543      0   100%

63 files skipped due to complete coverage.

@@ -730,14 +730,18 @@ def _quantize_layers(self, *input_calibration_data: numpy.ndarray):
node_results[output_name] = node_output[0]
constants.add(output_name)

def quantize_module(self, *calibration_data: numpy.ndarray) -> QuantizedModule:
def quantize_module(
self, *calibration_data: numpy.ndarray, keep_onnx: Optional[bool] = True
Contributor

@kcelia kcelia Jan 6, 2025


Do you mean keep_onnx: Optional[bool] = None, or keep_onnx: bool = True? As written, Optional[bool] = True mixes the two conventions.
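The two consistent alternatives raised in this review comment can be illustrated with stub signatures (hypothetical stand-ins, not the real quantize_module): either a plain bool with a True default, or Optional[bool] defaulting to None with the real default resolved inside the function.

```python
from typing import Optional

import numpy

# Illustrative stubs, not the actual Concrete ML signatures.

# 1) Plain bool with a True default: callers always pass/receive a bool.
def quantize_module_v1(*calibration_data: numpy.ndarray, keep_onnx: bool = True) -> bool:
    return keep_onnx

# 2) Optional[bool] with a None default: None means "use the default",
#    which is resolved inside the function body.
def quantize_module_v2(
    *calibration_data: numpy.ndarray, keep_onnx: Optional[bool] = None
) -> bool:
    return True if keep_onnx is None else keep_onnx

print(quantize_module_v1())                  # -> True
print(quantize_module_v2())                  # -> True
print(quantize_module_v2(keep_onnx=False))   # -> False
```

Variant 2 is useful when a caller needs to distinguish "explicitly set" from "left at the default"; otherwise variant 1 is simpler.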

@@ -686,6 +691,8 @@ def quant(self, values: numpy.ndarray) -> numpy.ndarray:
assert self.offset is not None
assert self.scale is not None

assert dtype in (numpy.int64, numpy.int32, numpy.float32, numpy.float64)
Contributor

The reason will be displayed to describe this comment to others. Learn more.

maybe:

valid_dtypes = (numpy.int64, numpy.int32, numpy.float32, numpy.float64)
assert dtype in valid_dtypes, f"Invalid dtype: `{dtype}`. Expected one of {valid_dtypes}."
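The suggested assertion can be exercised standalone as follows (check_dtype is a hypothetical wrapper added here for illustration, not a Concrete ML function):

```python
import numpy

def check_dtype(dtype) -> None:
    # Reviewer-suggested form: name the valid dtypes once and include
    # them in the failure message for easier debugging.
    valid_dtypes = (numpy.int64, numpy.int32, numpy.float32, numpy.float64)
    assert dtype in valid_dtypes, f"Invalid dtype: `{dtype}`. Expected one of {valid_dtypes}."

check_dtype(numpy.int64)  # passes silently
try:
    check_dtype(numpy.float16)
except AssertionError as exc:
    print("rejected:", "float16" in str(exc))  # -> rejected: True
```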

NotImplementedError: GLWE backend deployment is not yet supported
Contributor

@kcelia kcelia Jan 6, 2025


~~There are 2 errors in this notebook:~~
NotImplementedError: GLWE backend deployment is not yet supported
AssertionError: assert self.private_key is not None

Collaborator Author


I fixed and re-ran this notebook in #969

@andrei-stoian-zama andrei-stoian-zama merged commit b446cb2 into main Jan 6, 2025
20 of 21 checks passed
@andrei-stoian-zama andrei-stoian-zama deleted the chore/optimize_mem_and_runtime_fhe_disable branch January 6, 2025 15:15
3 participants