add disco diffusion stable diffusion into taskflow #3198

JunnYu
Member

@JunnYu JunnYu commented Sep 5, 2022

PR types

New features

PR changes

Taskflow

Description

  • Integrate Disco Diffusion and Stable Diffusion into Taskflow
>>> from paddlenlp import Taskflow
# The default model is pai-painter-painting-base-zh.
>>> text_to_image = Taskflow("text_to_image", model="pai-painter-painting-base-zh", num_return_images=2)
# Single input; returns 2 images here (num_return_images=2).
>>> images = text_to_image("风阁水帘今在眼,且来先看早梅红")
# [[<PIL.Image.Image image mode=RGB size=256x256>, <PIL.Image.Image image mode=RGB size=256x256>]]
>>> images[0][0].save("painting-figure-1.png")
>>> images[0][1].save("painting-figure-2.png")
>>> images[0][0].argument
# argument holds the parameters that were used to generate this image
# {'input': '风阁水帘今在眼,且来先看早梅红',
#  'batch_size': 1,
#  'seed': 2414128200,
#  'temperature': 1.0,
#  'top_k': 32,
#  'top_p': 1.0,
#  'condition_scale': 10.0,
#  'num_return_images': 2,
#  'use_faster': False,
#  'use_fp16_decoding': False,
#  'image_index_in_returned_images': 0}
#
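# Multiple inputs. The return value is laid out as: [[first image for the first text, second image for the first text], [first image for the second text, second image for the second text]]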
>>> image_list = text_to_image(["风阁水帘今在眼,且来先看早梅红", "见说春风偏有贺,露花千朵照庭闹"])
# [[<PIL.Image.Image image mode=RGB size=256x256>, <PIL.Image.Image image mode=RGB size=256x256>],
#  [<PIL.Image.Image image mode=RGB size=256x256>, <PIL.Image.Image image mode=RGB size=256x256>]]
>>> for batch_index, batch_image in enumerate(image_list):
>>>     # len(batch_image) == 2 (num_return_images)
>>>     for return_image_index, each_image in enumerate(batch_image):
>>>         each_image.save(f"painting-figure_{batch_index}_{return_image_index}.png")
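The nested return value and the per-image `argument` dict shown above lend themselves to a small saving helper. The following is a minimal sketch, not part of this PR: `save_nested_images` and the sidecar-JSON layout are hypothetical, and it assumes only that each image has a `save(path)` method and an `argument` attribute holding a dict, as in the examples above.

```python
import json
from pathlib import Path


def save_nested_images(image_list, prefix, out_dir="."):
    """Save every image of a [[img, ...], ...] result plus its parameters.

    Hypothetical helper: assumes each image exposes save(path) and an
    `argument` dict, as the Taskflow examples in this PR show.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for batch_index, batch_images in enumerate(image_list):
        for image_index, image in enumerate(batch_images):
            stem = f"{prefix}_{batch_index}_{image_index}"
            image_path = out_dir / f"{stem}.png"
            image.save(str(image_path))
            # Keep the generation parameters next to the image; default=str
            # covers values that json cannot encode directly.
            argument = getattr(image, "argument", {})
            json_path = out_dir / f"{stem}.json"
            json_path.write_text(
                json.dumps(argument, ensure_ascii=False, indent=2, default=str),
                encoding="utf-8",
            )
            saved.append(image_path)
    return saved
```

Called as `save_nested_images(image_list, "painting-figure")`, this would write files named like those in the loop above, each with a `.json` file beside it that can later be loaded and passed back as generation parameters.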

Multiple models are supported

pai-painter model from the EasyNLP repository
>>> text_to_image = Taskflow("text_to_image", model="pai-painter-commercial-base-zh", num_return_images=2)
>>> image_list = text_to_image(["女童套头毛衣打底衫秋冬针织衫童装儿童内搭上衣", "春夏真皮工作鞋女深色软皮久站舒适上班面试职业皮鞋"])
>>> for batch_index, batch_image in enumerate(image_list):
>>>     # len(batch_image) == 2 (num_return_images)
>>>     for return_image_index, each_image in enumerate(batch_image):
>>>         each_image.save(f"commercial-figure_{batch_index}_{return_image_index}.png")
DALLE-mini model
>>> text_to_image = Taskflow("text_to_image", model="dalle-mini", num_return_images=2)
>>> image_list = text_to_image(["New York Skyline with 'Google Research Pizza Cafe' written with fireworks on the sky.", "Dali painting of WALL·E"])
>>> for batch_index, batch_image in enumerate(image_list):
>>>     # len(batch_image) == 2 (num_return_images)
>>>     for return_image_index, each_image in enumerate(batch_image):
>>>         each_image.save(f"dalle-mini-figure_{batch_index}_{return_image_index}.png")
Disco Diffusion model
# Note: this model generates images slowly, so it is best to return only 1 image.
>>> text_to_image = Taskflow("text_to_image", model="disco_diffusion_ernie_vil-2.0-base-zh", num_return_images=1)
>>> image_list = text_to_image("一幅美丽的睡莲池塘的画,由Adam Paquette在artstation上所做。")
>>> for batch_index, batch_image in enumerate(image_list):
>>>     for return_image_index, each_image in enumerate(batch_image):
>>>         each_image.save(f"disco_diffusion_ernie_vil-2.0-base-zh-figure_{batch_index}_{return_image_index}.png")
Stable Diffusion model
>>> text_to_image = Taskflow("text_to_image", model="CompVis/stable-diffusion-v1-4", mode="text2image", num_return_images=2)
>>> prompt = [
    "In the morning light,Chinese ancient buildings in the mountains,Magnificent and fantastic John Howe landscape,lake,clouds,farm,Fairy tale,light effect,Dream,Greg Rutkowski,James Gurney,artstation",
    "clouds surround the mountains and Chinese palaces,sunshine,lake,overlook,overlook,unreal engine,light effect,Dream,Greg Rutkowski,James Gurney,artstation"
    ]
>>> image_list = text_to_image(prompt)
>>> for batch_index, batch_image in enumerate(image_list):
>>>     # len(batch_image) == 2 (num_return_images)
>>>     for return_image_index, each_image in enumerate(batch_image):
>>>         each_image.save(f"stable-diffusion-figure_{batch_index}_{return_image_index}.png")

Reproducing generated results is supported (using the Stable Diffusion model as an example)

>>> from paddlenlp import Taskflow
>>> text_to_image = Taskflow("text_to_image", model="CompVis/stable-diffusion-v1-4", mode="text2image", num_return_images=2)
>>> prompt = [
    "In the morning light,Chinese ancient buildings in the mountains,Magnificent and fantastic John Howe landscape,lake,clouds,farm,Fairy tale,light effect,Dream,Greg Rutkowski,James Gurney,artstation",
    ]
>>> image_list = text_to_image(prompt)
>>> for batch_index, batch_image in enumerate(image_list):
>>>     # len(batch_image) == 2 (num_return_images)
>>>     for return_image_index, each_image in enumerate(batch_image):
>>>         each_image.save(f"stable-diffusion-figure_{batch_index}_{return_image_index}.png")
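# To reproduce the second image returned for prompt[0], first inspect the parameters used to generate it.
# After the loop above, each_image still refers to that second image.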
>>> each_image.argument
# {'mode': 'text2image',
#  'seed': 2389376819,
#  'height': 512,
#  'width': 512,
#  'num_inference_steps': 50,
#  'guidance_scale': 7.5,
#  'latents': None,
#  'num_return_images': 1,
#  'input': 'In the morning light,Chinese ancient buildings in the mountains,Magnificent and fantastic John Howe landscape,lake,clouds,farm,Fairy tale,light effect,Dream,Greg Rutkowski,James Gurney,artstation'}
# Apply these parameters with set_argument.
>>> text_to_image.set_argument(each_image.argument)
>>> new_image = text_to_image(each_image.argument["input"])
# Inspect the generated image; the result matches the earlier image.
>>> new_image[0][0]
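The claim that the regenerated image matches the original can be checked programmatically rather than by eye. A minimal sketch, assuming both objects are PIL images (PIL's `Image` provides `mode`, `size`, and `tobytes()`); `images_identical` is a hypothetical helper, not part of Taskflow. Byte-for-byte equality is a strict criterion that may not hold across different hardware or library versions.

```python
def images_identical(first, second):
    """Return True when two PIL-style images share mode, size, and pixel bytes."""
    if first.mode != second.mode or first.size != second.size:
        return False
    # tobytes() returns the raw pixel data, so equal bytes mean equal pixels.
    return first.tobytes() == second.tobytes()


# Usage with the objects from the example above (not run here):
# assert images_identical(each_image, new_image[0][0])
```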

@JunnYu JunnYu requested a review from guoshengCS September 5, 2022 08:18
@guoshengCS guoshengCS requested a review from wawltor September 5, 2022 08:38
Collaborator

@wawltor wawltor left a comment

LGTM

@wawltor wawltor merged commit caaa102 into PaddlePaddle:develop Sep 5, 2022