[Dance Diffusion] Add dance diffusion #803

patrickvonplaten · 2022-10-11T11:19:02Z

cc @apolinario to monitor progress

Checkpoints are uploaded here: https://huggingface.co/harmonai

Maestro Pipeline can be tested with:

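from diffusers import DiffusionPipeline
import scipy.io.wavfile

# Load the converted Maestro checkpoint and move it to the GPU.
pipe = DiffusionPipeline.from_pretrained("harmonai/maestro-150k")
pipe = pipe.to("cuda")

# `audios` is a batch of clips with shape (batch, channels, samples).
audios = pipe(num_inference_steps=100, sample_length_in_s=4.0).audios

# scipy expects (samples, channels), so write each clip separately and transpose it.
for i, audio in enumerate(audios):
    scipy.io.wavfile.write(f"maestro_test_{i}.wav", pipe.unet.sample_rate, audio.transpose())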

It relies on the DanceDiffusionPipeline, IPNDMScheduler, and UNet1DModel classes.
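A note on the array layout when saving the output: the sketch below assumes the pipeline returns audio as `(batch, channels, samples)`, whereas `scipy.io.wavfile.write` takes a 1-D array or one shaped `(samples, channels)`. The helper name `split_for_wav`, the 16 kHz rate, and the zero-filled batch are illustrative placeholders, not part of this PR.

```python
import numpy as np


def split_for_wav(audios):
    """Turn a (batch, channels, samples) array into a list of (samples, channels) clips."""
    audios = np.asarray(audios)
    if audios.ndim != 3:
        raise ValueError(f"expected 3 dimensions, got {audios.ndim}")
    # Iterating over the first axis yields one (channels, samples) clip at a time.
    return [clip.T for clip in audios]


# Placeholder batch standing in for `pipe(...).audios`: 2 stereo clips of 4 seconds.
sample_rate = 16000  # assumed value, read it from the model in real use
audios = np.zeros((2, 2, 4 * sample_rate), dtype=np.float32)

clips = split_for_wav(audios)
print(len(clips), clips[0].shape)
```

Each entry of `clips` can then be passed to `scipy.io.wavfile.write` with its own filename.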

TODO

Convert model weights and successfully port model
Add scheduler
Add pipeline

HuggingFaceDocBuilderDev · 2022-10-11T11:22:14Z

The documentation is not available anymore as the PR was closed or merged.


patrickvonplaten · 2022-10-11T12:49:06Z

src/diffusers/models/unet_1d_blocks.py

+        return self.main(input) + self.skip(input)
+
+
+def get_down_block(down_block_type, c, c_prev):


@natolambert - similar to unet_2d_blocks we'll have a new unet_1d_blocks.py file where you can define very customizable unet classes
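For readers unfamiliar with the pattern referred to here, a minimal sketch of a string-keyed block factory follows. The two block classes are plain-Python stand-ins (the real blocks would be `torch.nn.Module` subclasses), and the registry, the `build_down_blocks` helper, and the `in_channels`/`out_channels` argument names are assumptions made for illustration rather than the code in this PR.

```python
class DownBlock1D:
    """Stand-in for a real down block; only records its channel sizes."""

    def __init__(self, in_channels, out_channels):
        self.in_channels = in_channels
        self.out_channels = out_channels


class AttnDownBlock1D(DownBlock1D):
    """Stand-in for a variant that would add self-attention."""


# Maps the string stored in the model config to the class that implements it.
DOWN_BLOCKS = {
    "DownBlock1D": DownBlock1D,
    "AttnDownBlock1D": AttnDownBlock1D,
}


def get_down_block(down_block_type, in_channels, out_channels):
    """Instantiate the block registered under `down_block_type`."""
    try:
        block_cls = DOWN_BLOCKS[down_block_type]
    except KeyError:
        raise ValueError(f"{down_block_type} does not exist.") from None
    return block_cls(in_channels, out_channels)


def build_down_blocks(down_block_types, in_channels, block_out_channels):
    """Chain blocks so each one consumes the previous block's output channels."""
    blocks = []
    for block_type, out_channels in zip(down_block_types, block_out_channels):
        blocks.append(get_down_block(block_type, in_channels, out_channels))
        in_channels = out_channels
    return blocks


blocks = build_down_blocks(("DownBlock1D", "AttnDownBlock1D"), 2, (32, 64))
```

Keeping the block types as strings lets a saved config describe the architecture without importing any classes.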

natolambert · 2022-10-12T17:39:10Z

This model is in concurrent development with #105.

natolambert · 2022-10-18T18:47:16Z

src/diffusers/models/unet_1d.py

@@ -70,8 +70,9 @@ def __init__(
        self.sample_size = sample_size

        # time
-        self.time_proj = GaussianFourierProjection(embedding_size=8)
-        del self.time_proj.W
+        self.time_proj = GaussianFourierProjection(


Should we format this like in the 2d class?

Also, no embedding after projection?

diffusers/src/diffusers/models/unet_2d.py, lines 101 to 109 in 6cbb73b:

        if time_embedding_type == "fourier":
            self.time_proj = GaussianFourierProjection(embedding_size=block_out_channels[0], scale=16)
            timestep_input_dim = 2 * block_out_channels[0]
        elif time_embedding_type == "positional":
            self.time_proj = Timesteps(block_out_channels[0], flip_sin_to_cos, freq_shift)
            timestep_input_dim = block_out_channels[0]

        self.time_embedding = TimestepEmbedding(timestep_input_dim, time_embed_dim)

I think we should do this as soon as we have more than GaussianFourier; before that it's probably not necessary.

The RL UNet1D is different; I was prepping for that.
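As background for the 2D snippet quoted in this thread, here is a small pure-Python sketch of why the "fourier" branch uses `2 * block_out_channels[0]`: a Fourier projection concatenates a sine half and a cosine half, so its output is twice as wide as its weight vector. The function names and the example weights are assumptions for illustration, not the diffusers implementation.

```python
import math


def fourier_features(t, weights):
    """Project a scalar timestep onto sin/cos features, one pair per weight."""
    angles = [2 * math.pi * t * w for w in weights]
    # Concatenating both halves doubles the width relative to `weights`.
    return [math.sin(a) for a in angles] + [math.cos(a) for a in angles]


def timestep_input_dim(time_embedding_type, first_block_channels):
    """Width of the projected timestep that the embedding layer would receive."""
    if time_embedding_type == "fourier":
        return 2 * first_block_channels
    if time_embedding_type == "positional":
        return first_block_channels
    raise ValueError(f"unknown time_embedding_type: {time_embedding_type}")


weights = [0.5 * i for i in range(8)]  # placeholder for the random projection weights
features = fourier_features(0.25, weights)
print(len(features), timestep_input_dim("fourier", len(weights)))
```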

natolambert · 2022-10-18T18:47:53Z

src/diffusers/models/unet_1d.py

@@ -132,24 +133,24 @@ def forward(
            otherwise a `tuple`. When returning a tuple, the first element is the sample tensor.
        """
        # 1. time
-        timestep_embed = self.time_proj(timestep[:, None])[..., None].repeat([1, 1, sample.shape[2]])
+        timestep_embed = self.time_proj(timestep)[..., None]
+        timestep_embed = timestep_embed.repeat([1, 1, sample.shape[2]])

        sample = torch.cat([sample, timestep_embed], dim=1)


This logic should maybe go in the block rather than the forward?
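To make the shapes in this diff concrete, the sketch below reproduces the three steps (project the timestep, repeat it along the length axis, concatenate on the channel axis) with NumPy standing in for the torch calls. The projection function is a simplified re-implementation, and the scale of 16, batch size, channel count, and sample length are assumed values.

```python
import numpy as np


def gaussian_fourier_projection(timestep, weight):
    """Map timesteps of shape (batch,) to features of shape (batch, 2 * len(weight))."""
    x_proj = timestep[:, None] * weight[None, :] * 2 * np.pi
    return np.concatenate([np.sin(x_proj), np.cos(x_proj)], axis=-1)


rng = np.random.default_rng(0)
batch, channels, length = 2, 2, 64
weight = rng.standard_normal(8) * 16.0  # embedding_size=8 as in the diff; scale is assumed
sample = rng.standard_normal((batch, channels, length))
timestep = np.array([0.25, 0.75])

# 1. Project, then add a trailing length axis: (batch, 16) -> (batch, 16, 1)
timestep_embed = gaussian_fourier_projection(timestep, weight)[..., None]
# 2. Repeat along the length axis (the analogue of torch's .repeat): (batch, 16, length)
timestep_embed = np.tile(timestep_embed, (1, 1, sample.shape[2]))
# 3. Concatenate on the channel axis: (batch, channels + 16, length)
model_input = np.concatenate([sample, timestep_embed], axis=1)
print(model_input.shape)
```

Every position along the length axis receives the same timestep features, which is why the first block sees `channels + 16` input channels.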


patrickvonplaten · 2022-10-20T15:56:09Z

tests/test_models_unet_1d.py

+
+class UnetModel1DTests(unittest.TestCase):
+    @slow
+    def test_unet_1d_maestro(self):


@natolambert this test needs to pass


patil-suraj

Thanks a lot for adding this model! Looks very good, just left some comments mostly related to docs.


patrickvonplaten · 2022-10-25T16:39:42Z

Ran the whole slow tests suite and everything passed

* start
* add more logic
* Update src/diffusers/models/unet_2d_condition_flax.py
* match weights
* up
* make model work
* making class more general, fixing missed file rename
* small fix
* make new conversion work
* up
* finalize conversion
* up
* first batch of variable renamings
* remove c and c_prev var names
* add mid and out block structure
* add pipeline
* up
* finish conversion
* finish
* upload
* more fixes
* Apply suggestions from code review
* add attr
* up
* uP
* up
* finish tests
* finish
* uP
* finish
* fix test
* up
* naming consistency in tests
* Apply suggestions from code review
  Co-authored-by: Suraj Patil <[email protected]>
  Co-authored-by: Pedro Cuenca <[email protected]>
  Co-authored-by: Nathan Lambert <[email protected]>
  Co-authored-by: Anton Lozhkov <[email protected]>
* remove hardcoded 16
* Remove bogus
* fix some stuff
* finish
* improve logging
* docs
* upload

Co-authored-by: Nathan Lambert <[email protected]>
Co-authored-by: Suraj Patil <[email protected]>
Co-authored-by: Pedro Cuenca <[email protected]>
Co-authored-by: Nathan Lambert <[email protected]>
Co-authored-by: Anton Lozhkov <[email protected]>

start

635f005

patrickvonplaten requested a review from natolambert October 11, 2022 11:19

add more logic

3103f9f

patrickvonplaten commented Oct 11, 2022

src/diffusers/models/unet_2d_condition_flax.py (outdated, resolved)

Update src/diffusers/models/unet_2d_condition_flax.py

2d6d178

patrickvonplaten commented Oct 11, 2022

natolambert mentioned this pull request Oct 12, 2022

Add UNet 1d for RL model for planning + colab #105

Merged

2 tasks

patrickvonplaten added 4 commits October 18, 2022 17:45

match weights

3f0a8f8

up

f882c33

up

a353452

make model work

37addaa

natolambert reviewed Oct 18, 2022

src/diffusers/models/unet_1d.py (resolved)

natolambert and others added 6 commits October 18, 2022 14:37

making class more general, fixing missed file rename

05b4a0b

small fix

1697eec

make new conversion work

1a019c3

up

a2bf35b

finalize conversion

f7220cf

up

5b1e292

patrickvonplaten commented Oct 20, 2022

natolambert and others added 8 commits October 20, 2022 14:46

first batch of variable renamings

a9f111b

remove c and c_prev var names

8320ff6

add mid and out block structure

3cd030f

add pipeline

0303a4d

Merge branch 'add_dance_diffusion' of https://github.com/huggingface/diffusers into add_dance_diffusion

c9ea2c1

up

20dee8d

finish conversion

a5764dc

finish

077406c

patrickvonplaten requested review from pcuenca, patil-suraj and anton-l October 24, 2022 14:13

up

d1ec608

natolambert reviewed Oct 24, 2022

naming consistency in tests

9ce38e6

zqevans reviewed Oct 24, 2022

src/diffusers/models/unet_1d.py (outdated, resolved)

zqevans reviewed Oct 24, 2022

scripts/convert_dance_diffusion_to_diffusers.py (outdated, resolved)

zqevans reviewed Oct 24, 2022

src/diffusers/models/unet_1d.py (outdated, resolved)

patil-suraj approved these changes Oct 25, 2022

anton-l reviewed Oct 25, 2022

tests/pipelines/dance_diffusion/test_dance_diffusion.py (outdated, resolved)

anton-l reviewed Oct 25, 2022

tests/pipelines/dance_diffusion/test_dance_diffusion.py (resolved)

anton-l approved these changes Oct 25, 2022

pcuenca approved these changes Oct 25, 2022

src/diffusers/models/unet_1d.py (outdated, resolved)

src/diffusers/models/unet_1d.py (outdated, resolved)

src/diffusers/models/unet_1d.py (resolved)

patrickvonplaten and others added 12 commits October 25, 2022 14:43

Apply suggestions from code review

f4d3e59

Co-authored-by: Suraj Patil <[email protected]>
Co-authored-by: Pedro Cuenca <[email protected]>
Co-authored-by: Nathan Lambert <[email protected]>
Co-authored-by: Anton Lozhkov <[email protected]>

remove hardcoded 16

b763d80

Remove bogus

e0744ee

Merge branch 'add_dance_diffusion' of https://github.com/huggingface/diffusers into add_dance_diffusion

8539305

fix some stuff

6b2196a

Merge branch 'main' of https://github.com/huggingface/diffusers into add_dance_diffusion

48648a4

finish

fbeeeaf

improve logging

4b0cc18

Merge branch 'add_dance_diffusion' of https://github.com/huggingface/diffusers into add_dance_diffusion

5bea0a2

docs

58d7f16

Merge branch 'add_dance_diffusion' of https://github.com/huggingface/diffusers into add_dance_diffusion

63c1e41

upload

cf79361

patrickvonplaten merged commit 88fa6b7 into main Oct 25, 2022

patrickvonplaten deleted the add_dance_diffusion branch October 25, 2022 16:39
