
Flux - soft inpainting via differential diffusion #9268

Merged
merged 8 commits into huggingface:main on Oct 14, 2024

Conversation

ryanlyn (Contributor) commented Aug 25, 2024

What does this PR do?

Adds a new community pipeline that brings Differential Diffusion to the Flux.1 family of models (currently Flux.1-schnell and Flux.1-dev).

Builds right on top of the fantastic work of #9135. My additions pertain only to the various diff-diff annotations.
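
For context, the core of differential diffusion is a per-step threshold over a grayscale change map: every pixel gets its own point in the schedule after which it is no longer reset to the (re-noised) original image. A rough sketch of the idea in pseudocode (illustrative names, not the PR's exact code):

# Illustrative pseudocode of the diff-diff update; `change_map` is a
# per-pixel value in [0, 1] and `thresholds` sweeps across the schedule,
# so each pixel "unlocks" for editing at a different denoising step.
for i, t in enumerate(timesteps):
    latents = denoise_step(latents, t)  # ordinary denoising update
    # pixels above this step's threshold keep the denoised prediction ...
    keep = (change_map > thresholds[i]).to(latents.dtype)
    # ... the rest are reset to the original image, noised to level t
    noised_original = scheduler.scale_noise(original_latents, t, noise)
    latents = keep * latents + (1 - keep) * noised_original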

Things to do:

  • implementation
  • documentation

Testing

Flux.1-schnell

The schnell model can be used following this example:

import torch
from diffusers.utils import load_image

# `FluxDifferentialImg2ImgPipeline` is the community pipeline added in this PR;
# `preprocess_image` and `preprocess_map` are sketched below.
image = preprocess_image(load_image(
    "https://huggingface.co/datasets/OzzyGT/testing-resources/resolve/main/differential/20240329211129_4024911930.png?download=true"
))

mask = preprocess_map(load_image(
    "https://huggingface.co/datasets/OzzyGT/testing-resources/resolve/main/differential/gradient_mask.png?download=true"
))

pipe = FluxDifferentialImg2ImgPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-schnell",
    torch_dtype=torch.bfloat16,
    device_map="balanced",
)

out = pipe(
    prompt="a red strawberry, black background",
    guidance_scale=0.0,  # schnell is guidance-distilled; CFG stays off
    num_inference_steps=12,
    image=image,
    mask_image=mask,
    strength=0.88,
).images[0]

out.show()
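
Note that `preprocess_image` and `preprocess_map` are not defined in the snippet above. A minimal sketch of what they might look like, adapted from the earlier SDXL differential img2img example (hypothetical helpers; the exact ones may differ):

from PIL import Image

def preprocess_image(image: Image.Image) -> Image.Image:
    # Crop to dimensions divisible by 16 so the latent packing math works out.
    w, h = image.size
    return image.convert("RGB").crop((0, 0, w // 16 * 16, h // 16 * 16))

def preprocess_map(map_image: Image.Image) -> Image.Image:
    # The change map is single-channel; gradients control how early each
    # region is allowed to change.
    w, h = map_image.size
    return map_image.convert("L").crop((0, 0, w // 16 * 16, h // 16 * 16))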

A red strawberry, black background (12 steps at 0.9 strength):
[image]
Blending with the schnell model is hard to get right.

Flux.1-dev

I expect the dev model to be used like this if there is sufficient VRAM:

import torch
from diffusers.utils import load_image

image = load_image(
    "https://github.com/exx8/differential-diffusion/blob/main/assets/input.jpg?raw=true"
)

mask = load_image(
    "https://github.com/exx8/differential-diffusion/blob/main/assets/map.jpg?raw=true"
)

pipe = FluxDifferentialImg2ImgPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()  # offload submodules to CPU to lower peak VRAM

out = pipe(
    prompt="...",  # your prompt here
    num_inference_steps=20,
    guidance_scale=7.5,
    image=image,
    mask_image=mask,
    strength=1.0,
).images[0]

out.show()

My tests, however, were all done on the FP8-quantized version (https://huggingface.co/Kijai/flux-fp8/blob/main/flux1-dev-fp8.safetensors):
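
For reference, one plausible way to wire such an FP8 checkpoint into this pipeline (a hedged sketch, not from the PR; it assumes diffusers' single-file loading can read this checkpoint, with weights upcast to bfloat16):

import torch
from diffusers import FluxTransformer2DModel

# Load only the transformer from the single-file FP8 checkpoint.
transformer = FluxTransformer2DModel.from_single_file(
    "https://huggingface.co/Kijai/flux-fp8/blob/main/flux1-dev-fp8.safetensors",
    torch_dtype=torch.bfloat16,
)
pipe = FluxDifferentialImg2ImgPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()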

A green pear, black background (50 steps at 1.0 strength):
[image]

painting of a mountain landscape with a meadow and a forest, meadow background, anime countryside landscape, anime nature wallpap, anime landscape wallpaper, studio ghibli landscape, anime landscape, mountain behind meadow, anime background art, studio ghibli environment, background of flowery hill, anime beautiful peace scene, forrest background, anime scenery, landscape background, background art, anime scenery concept art (20 steps at 1.0 strength):
[image]

Before submitting

  • Did you read the contributor guideline?
  • Did you read our philosophy doc (important for complex PRs)?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.

Who can review?

Skquark commented Sep 11, 2024

Question: would this work as an alternative to FluxInpaint to do outpainting with the inner mask blurred, or should I wait for a FluxDifferentialInpaintPipeline? I've been trying to add Flux to my Infinite Zoom implementation, and while I got it working, it just doesn't blend the masked area well and each outstep is framed. It works nicely with SD 1.5 Inpainting. I've tried blurring the black/white mask_image, which didn't help. I also notice that with this differential mask_image the masked area is black instead of white, as it is in normal inpainting, correct? It can be tested in my app at DiffusionDeluxe.com if you're curious.

yiyixuxu requested a review from asomoza on September 12, 2024 at 02:29
yiyixuxu (Collaborator) commented:

@asomoza can you take a look and help merge this?

HuggingFaceDocBuilderDev commented:

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

asomoza (Member) commented Sep 16, 2024

Hi @ryanlyn, sorry for the late review. Did you update the pipeline with the latest changes from the img2img pipeline? This works OK and we don't really enforce the guidelines too strictly here, but it would be nice if the copied lines were the same as in the img2img pipeline.

Let's merge this soon!

jffu commented Oct 11, 2024

Thanks for the great work! It looks like it fails when batch_size > 1?

Traceback (most recent call last):
  File "test_diff_diff_flux.py", line 41, in <module>
    image = pipeline(
  File "/home/admin/miniconda3/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/admin/workspace/aop_lab/app_source/pipeline_flux_differential_img2img.py", line 855, in __call__
    latents, noise, original_image_latents, latent_image_ids = self.prepare_latents(
  File "/home/admin/workspace/aop_lab/app_source/pipeline_flux_differential_img2img.py", line 566, in prepare_latents
    image_latents = self._pack_latents(image_latents, batch_size, num_channels_latents, height, width)
  File "/home/admin/workspace/aop_lab/app_source/pipeline_flux_differential_img2img.py", line 504, in _pack_latents
    latents = latents.view(batch_size, num_channels_latents, height // 2, 2, width // 2, 2)
RuntimeError: shape '[2, 16, 64, 2, 64, 2]' is invalid for input of size 262144

asomoza mentioned this pull request on Oct 11, 2024
asomoza (Member) commented Oct 11, 2024

Giving a gentle ping to @ryanlyn so we can merge this pipeline. @jffu, that error is probably because this pipeline doesn't have the latest updates from the base pipeline, unless the same error also happens with the base img2img pipeline.

jffu commented Oct 12, 2024

> Giving a gentle ping to @ryanlyn so we can merge this pipeline. @jffu, that error is probably because this pipeline doesn't have the latest updates from the base pipeline, unless the same error also happens with the base img2img pipeline.

@asomoza yes, merging the latest updates can fix this.
This is the modified section, just in case someone else needs it.

@@ -554,6 +554,17 @@ class FluxDifferentialImg2ImgPipeline(DiffusionPipeline, FluxLoraLoaderMixin):
         else:
             image_latents = latents
 
+        if batch_size > image_latents.shape[0] and batch_size % image_latents.shape[0] == 0:
+            # expand init_latents for batch_size
+            additional_image_per_prompt = batch_size // image_latents.shape[0]
+            image_latents = torch.cat([image_latents] * additional_image_per_prompt, dim=0)
+        elif batch_size > image_latents.shape[0] and batch_size % image_latents.shape[0] != 0:
+            raise ValueError(
+                f"Cannot duplicate `image` of batch size {image_latents.shape[0]} to {batch_size} text prompts."
+            )
+        else:
+            image_latents = torch.cat([image_latents], dim=0)
+
         noise = randn_tensor(shape, generator=generator, device=device, dtype=dtype)
         latents = noise if is_strength_max else self.scheduler.scale_noise(image_latents, timestep, noise)
         noise = self._pack_latents(noise, batch_size, num_channels_latents, height, width)
@@ -882,7 +893,7 @@ class FluxDifferentialImg2ImgPipeline(DiffusionPipeline, FluxLoraLoaderMixin):
         mask_thresholds = mask_thresholds.unsqueeze(1).unsqueeze(1).to(device)
         masks = (original_mask > mask_thresholds)
         masks = self._pack_latents(
-            masks.repeat(num_channels_latents, 1, 1, 1).permute(1, 0, 2, 3),
+            masks.repeat(num_channels_latents // num_images_per_prompt, 1, 1, 1).permute(1, 0, 2, 3),
             len(mask_thresholds),
             num_channels_latents,
             2 * (int(height) // self.vae_scale_factor),
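
With that patch applied, a batched call along these lines should work (an illustrative sketch; the prompt and settings are placeholders):

# Exercises the batch_size > 1 path fixed above: two prompts, one shared image.
out = pipe(
    prompt=["a red strawberry, black background"] * 2,
    image=image,       # a single image is duplicated to match the prompt batch
    mask_image=mask,
    strength=1.0,
    num_inference_steps=20,
    guidance_scale=7.5,
).images  # list of two images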

ryanlyn (Contributor, Author) commented Oct 13, 2024

Sorry about the wait @asomoza 🙏 and thank you for finding that issue @jffu. I've added the latest changes from the base pipeline, which addressed batching and simplified the mask arrangement.

asomoza (Member) commented Oct 14, 2024

Thanks a lot! Also, as a reference, this seems to be a really good alternative for Flux inpainting. I haven't tested the controlnet yet, but the quality of diff-diff seems decent.

[images: original | result | result]

asomoza merged commit 68d16f7 into huggingface:main on Oct 14, 2024
8 checks passed
Clement-Lelievre (Contributor) commented Oct 16, 2024

Thanks for this work!

What are the use cases for using this pipe with a strength param other than 1? Correct me if I'm wrong, but I feel doing so would somewhat go against the inference mechanism of this pipeline.

For example, a fully dark pixel in the change map would no longer be totally overridden in the output image's corresponding pixel.

asomoza (Member) commented Oct 16, 2024

Probably @exx8 can give you a more detailed answer, but for me it's the same as when you use img2img or inpainting: with a strength of 1.0 you ignore whatever was there before, with the difference that a gradient in the soft mask attenuates the transition between the masked and unmasked parts.

If you use a lower strength, the generation tries to adapt more to what was there before, and diff-diff also makes it merge better with the old part. This is IMO what makes diff-diff great: you can use it as inpainting or just to gradually change some parts of the image, like in the demo or like I did here.

In the same post you can also see what I did with the crow example; using 0.8 works great in that image because I didn't want to completely override what was there before, so it took part of the shape of the previous bird.

Apart from the strength, you can also play with the brightness of the mask. IMO people don't really understand the versatility of what you can do with diff-diff, partially because UIs don't have the kinds of tools needed to work with masks and gradients.
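
As a concrete illustration of that trade-off, a quick strength sweep (illustrative values and prompt, not from the thread):

# Lower strength keeps more of the original content under the change map.
for strength in (0.6, 0.8, 1.0):
    result = pipe(
        prompt="a crow perched on a branch",  # hypothetical prompt
        image=image,
        mask_image=mask,
        strength=strength,
        num_inference_steps=20,
        guidance_scale=7.5,
    ).images[0]
    result.save(f"diff_diff_strength_{strength}.png")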

sayakpaul pushed a commit that referenced this pull request Dec 23, 2024
* Flux - soft inpainting via differential diffusion

* .

* track changes to FluxInpaintPipeline

* make mask arrangement simplier

* make style

---------

Co-authored-by: YiYi Xu <[email protected]>
Co-authored-by: Álvaro Somoza <[email protected]>
Co-authored-by: asomoza <[email protected]>