Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP, don't merge] unity.cpp -> ggml master #719

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

cndn
Copy link

@cndn cndn commented Jan 31, 2024

This is a WIP PR to sync unity.cpp from seamless_communication (https://github.com/facebookresearch/seamless_communication/tree/main/ggml) to examples/ under ggml. Sharing for visibility and looking for early feedback from community & authors. Feel free to check README for usage.

Questions about ggml changes needed with this PR:
(1) Would you prefer a separate repo for unity.cpp (Like whisper.cpp / llama.cpp we could have a standalone (https://github.com/facebookresearch/seamless_communication/tree/main/ggml/unity.cpp), or checking into examples/ once it’s polished? Or both?

(2) We use kaldi-native-fbank for feature extraction and checked the whole library into ggml as examples/kaldi-native-fbank. Do you prefer a separate installation for this KNF lib? One caveat is with this lib sticking together I needed to update CMAKE_CXX_STANDARD to 14 from 11.

(3) We added several custom operators including batch_norm, glu, and convolution related ones. Realized convolution related ones already existed on master (but didn’t when we started). One TODO item is to merge with them, wondering if there’s ongoing effort on
(a) unifying depthwise conv & im2col ops to have one single ggml_conv_1d op with groups=1 or model_dim as argument
(b) supporting fp32 for im2col. Currently I ran into some issues when using im2col for our model which uses fp32 all the way down, so likely fp32 <-> fp16 cast related, still investigating.

(4) In order to convert fairseq2 checkpoints to ggml format, our script ggml_convert.py rely on ggml-python third party library https://github.com/abetlen/ggml-python (We have ggml.py copy in our folder). Just wondering if there’s a plan to add the python bindings to ggml repo, so we could make sure they are in sync.

Also any comment on the best path to integrate unity.cpp with awesome ggml would be appreciated, thanks in advance!

Copy link
Owner

@ggerganov ggerganov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi and thank you for the interesting project!

  1. The ggml examples need to be small and I think unity.cpp is not very suitable in this regard. Probably a separate repo would be better. I've been keeping whisper.cpp as an example, but it's a bit tedious to keep it in sync, so I'm already considering to remove it, as it does not bring much value

  2. If this project is migrated to it's own repo, then I think having kaldi-native-fbank in the code base (or even as submodule) is fine. It's not very suitable for ggml example (see 1). Regarding the c++14 standard - I guess that's OK, but it mainly depends on what the unity.cpp repo wants to target. My personal opinion is that c++11 has all the good stuff and from then on it is mostly complications, but this is of course very subjective

  3. Yes, reusing what is already available is recommended. There are things to improve like supporting F16/F32 data types where missing, implementing GPU kernels (optional), etc. so any progress in this regard is welcome. Make sure to understand how the test-backend-ops tool works and add necessary tests to cover the new functionality - it is very valuable

  4. Adding Python bindings to ggml is not currently on the roadmap. There is actually an attempt in the examples: https://github.com/ggerganov/ggml/tree/master/examples/python

    I'm not sure however what is it's state and if it is really usable for anything. If it is easy to update and make use of it, then we can try to support it officially, but would definitely need some help from the OSS community to do that.

    I think you need the bindings only to get the ggml_type enum, correct?

Also any comment on the best path to integrate unity.cpp with awesome ggml would be appreciated, thanks in advance!

I would say that the best option is a dedicated unity.cpp repo with ggml as a submodule. I guess the stable-diffusion.cpp is one of the better examples of this. Adopting the ggml-backend API is recommended and will provide seamless GPU support in the future (given that all the GPU kernels become available)

Hope this is useful, let us know if you have further questions - would be happy to help!

Edit: didn't mean to hit "Approve" 🤷

struct ggml_tensor * a,
struct ggml_tensor * b,
int p0);

GGML_API struct ggml_tensor * ggml_conv_1d(
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is an implementation available on master:

ggml/include/ggml/ggml.h

Lines 1499 to 1510 in 6b14d73

GGML_API struct ggml_tensor * ggml_conv_depthwise_2d(
struct ggml_context * ctx,
struct ggml_tensor * a,
struct ggml_tensor * b,
int s0,
int s1,
int p0,
int p1,
int d0,
int d1);

@cndn
Copy link
Author

cndn commented Feb 2, 2024

Thanks @ggerganov for detailed comments!
Regarding 1 and 2, makes sense, I will keep unity.cpp under seamless_communication for now. So keeping KNF library there would be easier without bloating ggml main. Feel free to import the folder to your side as a standalone repo, if you'd like.

For 3, thanks for the suggestion, will look at test-backend-ops and add tests for new ops.

For 4, The binding is also used in our custom scripts for testing test_unity_cpp.py (will also include in separate repo, now it lives under seamless_communication) and converting fairseq2 checkpoints to ggml format ggml_convert.py. And yes the latter only uses type enum.

@bil-ash
Copy link

bil-ash commented Apr 8, 2024

@cndn Please add quantization ggml options to unity.cpp

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants