
[Cog] some minor fixes and nits #9466

Merged
merged 8 commits into from
Sep 23, 2024
Conversation

@sayakpaul (Member) commented Sep 19, 2024

What does this PR do?

  • Cleans up check_inputs()
  • Moves the generator check in prepare_latents() at the beginning of the method so that we can error as early as possible.
  • Documentation nits.
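The early-exit generator check described above can be sketched as a small standalone helper. This is an illustrative sketch, not the exact diffusers code; the function name and message wording are assumptions:

```python
def check_generator(generator, batch_size):
    """Fail fast when a list of generators doesn't match the batch size.

    A minimal sketch of the check prepare_latents() performs; running it
    at the top of the method means we error before any tensors are allocated.
    """
    if isinstance(generator, list) and len(generator) != batch_size:
        raise ValueError(
            f"You have passed a list of generators of length {len(generator)}, "
            f"but requested an effective batch size of {batch_size}. "
            "Make sure the batch size matches the length of the generators."
        )
```

A single generator (or `None`) always passes; only a mismatched list raises.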

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@a-r-r-o-w (Member) left a comment

Oops 😬 So, we don't have a test that passes both prompt embeds and negative prompt embeds? I believe that would have caught this, no? It's surprising, and very lucky, that it was working for the normal prompt inputs so far.
Apologies for missing this on my part.

@sayakpaul sayakpaul removed the request for review from yiyixuxu September 19, 2024 04:37
@sayakpaul sayakpaul marked this pull request as draft September 19, 2024 04:37
@sayakpaul (Member, Author)

So, we don't have a test that passes both prompt embeds and negative prompt embeds? I believe that would have caught this, no?

I believe that could be added, but the test would be about checking that an error is thrown when both prompt and prompt_embeds are passed, along with other adjacent cases.
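Such a test could look roughly like the sketch below. The simplified check_inputs() here is a stand-in, not the actual pipeline method, which validates many more arguments:

```python
def check_inputs(prompt=None, prompt_embeds=None):
    # Simplified stand-in for the pipeline's check_inputs(); the real
    # method also validates heights, callback keys, negative prompts, etc.
    if prompt is not None and prompt_embeds is not None:
        raise ValueError(
            "Cannot forward both `prompt` and `prompt_embeds`. "
            "Please make sure to only forward one of the two."
        )


def test_raises_on_both_prompt_and_embeds():
    # Passing both inputs must be rejected.
    try:
        check_inputs(prompt="a cat", prompt_embeds=object())
    except ValueError:
        return  # expected
    raise AssertionError("check_inputs() should reject prompt + prompt_embeds")
```

Passing only one of the two should go through without raising.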

@sayakpaul (Member, Author)

Sorry, I have additional updates to make. Will let you all know here after that.

@sayakpaul sayakpaul changed the title [Cog V2V] fix positional arguments in check_inputs(). [Cog] some minor fixes and nits Sep 19, 2024
Comment on lines 580 to -591

height = height or self.transformer.config.sample_size * self.vae_scale_factor_spatial
width = width or self.transformer.config.sample_size * self.vae_scale_factor_spatial
@sayakpaul (Member, Author)

We don't need it because height and width are already at their default values.
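For context, this fallback pattern is a no-op once height and width have already been resolved upstream. A sketch of the pattern, with illustrative values that are not CogVideoX's actual config:

```python
class _Config:
    sample_size = 60  # illustrative value, not the real transformer config


vae_scale_factor_spatial = 8  # illustrative


def default_hw(height=None, width=None):
    # The `or` fallback only matters when the caller passes None; if
    # height/width were already defaulted earlier in __call__, repeating
    # it here changes nothing, so the lines can be removed.
    height = height or _Config.sample_size * vae_scale_factor_spatial
    width = width or _Config.sample_size * vae_scale_factor_spatial
    return height, width
```

With these example numbers, `default_hw()` resolves to (480, 480), while explicit values pass through untouched.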

Comment on lines +210 to +212
self.vae_scaling_factor_image = (
self.vae.config.scaling_factor if hasattr(self, "vae") and self.vae is not None else 0.7
)
@sayakpaul (Member, Author)

This is beneficial for scenarios where we want to run the pipeline without the VAE.

@sayakpaul sayakpaul requested a review from a-r-r-o-w September 21, 2024 01:38
@a-r-r-o-w (Member) left a comment

Looks good! Thanks for the fixes

@sayakpaul sayakpaul marked this pull request as ready for review September 21, 2024 03:29
@sayakpaul sayakpaul requested a review from yiyixuxu September 21, 2024 03:29
@@ -188,6 +188,9 @@ def __init__(
self.vae_scale_factor_temporal = (
self.vae.config.temporal_compression_ratio if hasattr(self, "vae") and self.vae is not None else 4
)
self.vae_scaling_factor_image = (
@sayakpaul (Member, Author)

Adding this so that the pipeline can operate without the VAE.

@yiyixuxu (Collaborator) left a comment

Thanks!

@sayakpaul sayakpaul merged commit ba5af5a into main Sep 23, 2024
18 checks passed
@sayakpaul sayakpaul deleted the correct-args-cog-v2v branch September 23, 2024 05:57
@FurkanGozukara commented Sep 27, 2024

Thank you for the amazing work.

I am using the code running here: https://huggingface.co/spaces/THUDM/CogVideoX-5B-Space

However, I am getting the errors below in both text-to-video and video-to-video. Could something in the pipeline have broken?

pipe = CogVideoXPipeline.from_pretrained("THUDM/CogVideoX-5b", torch_dtype=default_dtype).to("cpu")
pipe.scheduler = CogVideoXDPMScheduler.from_config(pipe.scheduler.config, timestep_spacing="trailing")
    else:
        pipe.to(device)
        pipe.transformer = transformer
        pipe.vae = vae
        pipe.text_encoder = text_encoder

        if use_cpu_offload:
            pipe.enable_sequential_cpu_offload()
        if use_slicing:
            pipe.vae.enable_slicing()
        if use_tiling:
            pipe.vae.enable_tiling()

        video_pt = pipe(
            prompt=prompt,
            num_videos_per_prompt=1,
            num_inference_steps=num_inference_steps,
            num_frames=49,
            use_dynamic_cfg=True,
            output_type="pt",
            guidance_scale=guidance_scale,
            generator=torch.Generator(device="cpu").manual_seed(seed),
        ).frames
        gc.collect()
    return (video_pt, seed)

[image attachment: error screenshot]

@sayakpaul (Member, Author)

Please open a separate issue with a minimal, fully reproducible code snippet. https://github.com/user-attachments/files/17161626/app.txt is quite long and not minimal, and the snippet you provided is incomplete.

@FurkanGozukara
@sayakpaul doesn't the current diffusers release have CogVideoXLoraLoaderMixin?

It can't be found; I installed the latest version, 0.30.3, on Windows.

@a-r-r-o-w (Member)

v0.30.3 only released the image-to-video and video-to-video pipelines. LoRA support will be added in a future release. For now, you can use the LoRA loader mixin by installing from the main branch.

leisuzz pushed a commit to leisuzz/diffusers that referenced this pull request Oct 11, 2024
* fix positional arguments in check_inputs().

* add video and latents to check_inputs().

* prep latents_in_channels.

* quality

* multiple fixes.

* fix
sayakpaul added a commit that referenced this pull request Dec 23, 2024
* fix positional arguments in check_inputs().

* add video and latents to check_inputs().

* prep latents_in_channels.

* quality

* multiple fixes.

* fix