
SD3/3.5 issues with new releases #479

Closed · stduhpf opened this issue Nov 23, 2024 · 11 comments

@stduhpf (Contributor) commented Nov 23, 2024

Since the recent update, SD 3.5 generated images look a bit burned out compared to images generated with the exact same parameters on previous versions. This happens with both the CPU and Vulkan backends (I can't test the others). It looks like a VAE problem to me, since the image composition is almost exactly the same.

Also, SD3 images are pitch black when using the more recent version of GGML on Vulkan (on CPU they are just burned out, like with SD 3.5).

| model | master (c3eeb66) | before ggml update (b5f4932) | ac54e00 with #451 and #397 merged |
|---|---|---|---|
| sd 3.5 medium | output | output | skiplayercfg |
| sd 3.5 large turbo | output | output | output |
| sd 3 medium | output | output | output |
@stduhpf (Contributor, author) commented Nov 23, 2024

Git bisect result:

```
1c168d9 is the first bad commit
commit 1c168d9 (HEAD, tag: master-1c168d9)
Author: Erik Scholz [email protected]
Date:   Sat Nov 23 05:39:08 2024 +0100

    fix: repair flash attention support (#386)

    * repair flash attention in _ext
      this does not fix the currently broken fa behind the define, which is only used by VAE

    Co-authored-by: FSSRepo <[email protected]>

    * make flash attention in the diffusion model a runtime flag
      no support for sd3 or video

    * remove old flash attention option and switch vae over to attn_ext

    * update docs

    * format code

    ---------

    Co-authored-by: FSSRepo <[email protected]>
    Co-authored-by: leejet <[email protected]>

 CMakeLists.txt        |   6 --
 README.md             |  21 +++-
 clip.hpp              |  18 ++--
 common.hpp            |  23 ++--
 conditioner.hpp       |  19 ++--
 diffusion_model.hpp   |  12 ++-
 examples/cli/main.cpp |  10 +-
 flux.hpp              |  49 ++++++---
 ggml_extend.hpp       |  77 ++++++-------
 model.cpp             |  28 +++--
 model.h               |   2 +-
 pmid.hpp              | 293 +++++++++++++++++++++++---------------------------
 stable-diffusion.cpp  |  61 ++++++-----
 stable-diffusion.h    |   3 +-
 unet.hpp              |  11 +-
 util.cpp              |   5 +-
 vae.hpp               |  10 +-
 17 files changed, 334 insertions(+), 314 deletions(-)
```

@Green-Sky Do you have a clue which change could be the root cause? (My tests were with flash attention disabled.)
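For anyone wanting to reproduce the bisect, the general workflow looks roughly like this (endpoints taken from the table above; the per-step check is whatever generate-and-inspect procedure you use, shown here only as comments):

```shell
# Mark the known-good and known-bad endpoints, then let git walk the range.
git bisect start
git bisect bad c3eeb66    # current master: burned-out images
git bisect good b5f4932   # before the ggml update: images OK
# At each step git checks out a midpoint commit: rebuild, generate a test
# image with fixed seed/params, then mark it:
#   git bisect good   # image looks correct
#   git bisect bad    # image is burned out / black
# git prints "<hash> is the first bad commit" when the range is exhausted.
git bisect reset
```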

@Green-Sky (Contributor)

Damn. Yeah, I refactored the attention. However, the non-flash-attn code should be equivalent. See the changes in ggml_extend.hpp.
Can you play around with that? I don't have SD3 handy right now.


@leejet (Owner) commented Nov 24, 2024

I think it's an issue with ggml_nn_attention_ext as used by the VAE. I've reverted the relevant changes; you can try again.

@Green-Sky (Contributor)

Hm, I wonder what went wrong there. I tried to unify the attention code, and it worked for every model I tested...

@stduhpf (Contributor, author) commented Nov 24, 2024

@leejet Yes, 4570715 fixed the burn-in issue. SD3 is still black on Vulkan, but that probably comes from upstream ggml.

@stduhpf stduhpf closed this as completed Nov 24, 2024
@stduhpf (Contributor, author) commented Nov 24, 2024

@Green-Sky I don't exactly know what I'm doing, but I just managed to fix it on my end by setting skip_reshape to false when calling ggml_nn_attention_ext() in vae.hpp. Was it set to true for a reason?

```diff
diff --git a/vae.hpp b/vae.hpp
index 0c7d84f..622a1d9 100644
--- a/vae.hpp
+++ b/vae.hpp
@@ -104,7 +104,7 @@ public:
         v      = ggml_reshape_3d(ctx, v, c, h * w, n);              // [N, h * w, in_channels]

         // h_ = ggml_nn_attention(ctx, q, k, v, false);  // [N, h * w, in_channels]
-        h_ = ggml_nn_attention_ext(ctx, q, k, v, 1, nullptr, false, true, false);
+        h_ = ggml_nn_attention_ext(ctx, q, k, v, 1, nullptr, false, false, false);
```
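For context on what that code path computes: the VAE block flattens the [N, C, H, W] feature map to [N, h*w, C] before attention (the ggml_reshape_3d call above). A minimal numpy sketch of that computation, as plain single-head scaled dot-product attention, is below; my reading of skip_reshape (whether the helper should expect pre-flattened inputs) is a guess, not confirmed from the source.

```python
import numpy as np

def vae_self_attention(x):
    """x: [N, C, H, W] feature map -> same shape, after single-head self-attention."""
    n, c, h, w = x.shape
    # Flatten spatial dims: [N, C, H, W] -> [N, h*w, C], mirroring the reshape
    # done in vae.hpp before the attention helper is called.
    q = k = v = x.reshape(n, c, h * w).transpose(0, 2, 1)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(c)        # [N, h*w, h*w]
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)              # softmax over keys
    out = attn @ v                                        # [N, h*w, C]
    # Undo the flattening so the caller gets a feature map back.
    return out.transpose(0, 2, 1).reshape(n, c, h, w)

x = np.random.default_rng(0).standard_normal((1, 8, 4, 4))
y = vae_self_attention(x)
print(y.shape)  # (1, 8, 4, 4)
```

If the helper internally re-flattened inputs that were already flattened, the shapes would silently permute, which would be consistent with the "composition intact but colors off" symptom.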
@Green-Sky (Contributor)

> @Green-Sky I don't exactly know what I'm doing, but I just managed to fix it on my end by setting skip_reshape to false when calling ggml_nn_attention_ext() in vae.hpp.

Oh, that's nice, did you test other models too?

> Was it set to true for a reason?

I honestly don't remember. I will do more tests later.

@stduhpf (Contributor, author) commented Nov 24, 2024

> Oh, that's nice, did you test other models too?

Flux Lite and SDXL seem to work, so I'd assume everything should be fine.

@stduhpf (Contributor, author) commented Nov 24, 2024

Also, the black image with SD3 on Vulkan isn't a VAE issue. I tried with #454, and the previews were also black. I'll try to figure out what is going on here.

@stduhpf (Contributor, author) commented Nov 25, 2024

Ok, that is kind of funny. The SD3 issue is actually related to another issue I already reported in llama.cpp (ggml-org/llama.cpp#10434), which ggml-org/llama.cpp#10437 fixed. Patching soft_max.comp the same way fixes it here too.
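For anyone curious why a broken softmax shader shows up as fully black images: if the kernel overflows on large logits, it emits NaN/Inf attention weights that propagate through every subsequent layer. The standard guard, sketched below in numpy, is subtracting the row max before exponentiating; whether that is the exact defect the Vulkan patch fixed is not something I have verified.

```python
import numpy as np

def softmax_naive(x):
    e = np.exp(x)            # overflows to inf for large inputs
    return e / e.sum()       # inf / inf -> nan

def softmax_stable(x):
    e = np.exp(x - x.max())  # shift by the max: same result, no overflow
    return e / e.sum()

big = np.array([1000.0, 1001.0, 1002.0])
print(softmax_naive(big))    # [nan nan nan]
print(softmax_stable(big))   # finite probabilities summing to 1
```

One NaN entry is enough: NaN times anything is NaN, so a single bad attention row blanks the whole output image.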
