[WIP, don't merge] unity.cpp -> ggml master #719

cndn · 2024-01-31T03:46:15Z

This is a WIP PR to sync unity.cpp from seamless_communication (https://github.com/facebookresearch/seamless_communication/tree/main/ggml) to examples/ under ggml. Sharing for visibility and looking for early feedback from community & authors. Feel free to check README for usage.

Questions about ggml changes needed with this PR:
(1) Would you prefer a separate repo for unity.cpp (Like whisper.cpp / llama.cpp we could have a standalone (https://github.com/facebookresearch/seamless_communication/tree/main/ggml/unity.cpp), or checking into examples/ once it’s polished? Or both?

(2) We use kaldi-native-fbank for feature extraction and checked the whole library into ggml as examples/kaldi-native-fbank. Do you prefer a separate installation for this KNF lib? One caveat is with this lib sticking together I needed to update CMAKE_CXX_STANDARD to 14 from 11.

(3) We added several custom operators including batch_norm, glu, and convolution related ones. Realized convolution related ones already existed on master (but didn’t when we started). One TODO item is to merge with them, wondering if there’s ongoing effort on
(a) unifying depthwise conv & im2col ops to have one single ggml_conv_1d op with groups=1 or model_dim as argument
(b) supporting fp32 for im2col. Currently I ran into some issues when using im2col for our model which uses fp32 all the way down, so likely fp32 <-> fp16 cast related, still investigating.

(4) In order to convert fairseq2 checkpoints to ggml format, our script ggml_convert.py rely on ggml-python third party library https://github.com/abetlen/ggml-python (We have ggml.py copy in our folder). Just wondering if there’s a plan to add the python bindings to ggml repo, so we could make sure they are in sync.

Also any comment on the best path to integrate unity.cpp with awesome ggml would be appreciated, thanks in advance!

ggerganov

Hi and thank you for the interesting project!

The ggml examples need to be small and I think unity.cpp is not very suitable in this regard. Probably a separate repo would be better. I've been keeping whisper.cpp as an example, but it's a bit tedious to keep it in sync, so I'm already considering to remove it, as it does not bring much value
If this project is migrated to it's own repo, then I think having kaldi-native-fbank in the code base (or even as submodule) is fine. It's not very suitable for ggml example (see 1). Regarding the c++14 standard - I guess that's OK, but it mainly depends on what the unity.cpp repo wants to target. My personal opinion is that c++11 has all the good stuff and from then on it is mostly complications, but this is of course very subjective
Yes, reusing what is already available is recommended. There are things to improve like supporting F16/F32 data types where missing, implementing GPU kernels (optional), etc. so any progress in this regard is welcome. Make sure to understand how the test-backend-ops tool works and add necessary tests to cover the new functionality - it is very valuable
Adding Python bindings to ggml is not currently on the roadmap. There is actually an attempt in the examples: https://github.com/ggerganov/ggml/tree/master/examples/python

I'm not sure however what is it's state and if it is really usable for anything. If it is easy to update and make use of it, then we can try to support it officially, but would definitely need some help from the OSS community to do that.

I think you need the bindings only to get the ggml_type enum, correct?

Also any comment on the best path to integrate unity.cpp with awesome ggml would be appreciated, thanks in advance!

I would say that the best option is a dedicated unity.cpp repo with ggml as a submodule. I guess the stable-diffusion.cpp is one of the better examples of this. Adopting the ggml-backend API is recommended and will provide seamless GPU support in the future (given that all the GPU kernels become available)

Hope this is useful, let us know if you have further questions - would be happy to help!

Edit: didn't mean to hit "Approve" 🤷

ggerganov · 2024-02-02T13:23:05Z

include/ggml/ggml.h

+        struct ggml_tensor  * a,
+        struct ggml_tensor  * b,
+        int                   p0);
+
    GGML_API struct ggml_tensor * ggml_conv_1d(


There is an implementation available on master:

ggml/include/ggml/ggml.h

Lines 1499 to 1510 in 6b14d73

GGML_API struct ggml_tensor * ggml_conv_depthwise_2d(

struct ggml_context * ctx,

struct ggml_tensor * a,

struct ggml_tensor * b,

int s0,

int s1,

int p0,

int p1,

int d0,

int d1);

cndn · 2024-02-02T21:56:01Z

Thanks @ggerganov for detailed comments!
Regarding 1 and 2, makes sense, I will keep unity.cpp under seamless_communication for now. So keeping KNF library there would be easier without bloating ggml main. Feel free to import the folder to your side as a standalone repo, if you'd like.

For 3, thanks for the suggestion, will look at test-backend-ops and add tests for new ops.

For 4, The binding is also used in our custom scripts for testing test_unity_cpp.py (will also include in separate repo, now it lives under seamless_communication) and converting fairseq2 checkpoints to ggml format ggml_convert.py. And yes the latter only uses type enum.

bil-ash · 2024-04-08T00:56:25Z

@cndn Please add quantization ggml options to unity.cpp

cndn added 3 commits January 23, 2024 03:19

initial sync

eaa5710

fix initialization before padding

e394ec9

wip

05b2095

ggerganov approved these changes Feb 2, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[WIP, don't merge] unity.cpp -> ggml master #719

[WIP, don't merge] unity.cpp -> ggml master #719

cndn commented Jan 31, 2024 •

edited

Loading

ggerganov left a comment •

edited

Loading

ggerganov Feb 2, 2024

cndn commented Feb 2, 2024 •

edited

Loading

bil-ash commented Apr 8, 2024


	GGML_API struct ggml_tensor * ggml_conv_depthwise_2d(
	struct ggml_context * ctx,
	struct ggml_tensor * a,
	struct ggml_tensor * b,
	int s0,
	int s1,
	int p0,
	int p1,
	int d0,
	int d1);

[WIP, don't merge] unity.cpp -> ggml master #719

Are you sure you want to change the base?

[WIP, don't merge] unity.cpp -> ggml master #719

Conversation

cndn commented Jan 31, 2024 • edited Loading

ggerganov left a comment • edited Loading

Choose a reason for hiding this comment

ggerganov Feb 2, 2024

Choose a reason for hiding this comment

cndn commented Feb 2, 2024 • edited Loading

bil-ash commented Apr 8, 2024

cndn commented Jan 31, 2024 •

edited

Loading

ggerganov left a comment •

edited

Loading

cndn commented Feb 2, 2024 •

edited

Loading