The official implementation of OmniFlow: Any-to-Any Generation with Multi-Modal Rectified Flows

jacklishufan/OmniFlows

OmniFlow: Any-to-Any Generation with Multi-Modal Rectified Flows


This repository contains the official code and checkpoints used in the paper "OmniFlow: Any-to-Any Generation with Multi-Modal Rectified Flows".

Environment Setup

# "omniflow" is an example environment name; any name works
conda create --name omniflow python=3.10
conda activate omniflow
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
pip install -r requirements.txt
pip install -e .
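
After installation, a quick check can confirm that the environment is usable. The following is a minimal sketch, not part of the repository: `check_environment` is a hypothetical helper, the package list is taken from the install commands above plus the `omniflow` module used in the inference example, and treating any Python version at or above 3.10 as acceptable is an assumption.

```python
import importlib.util
import sys

# Top-level module names from the install commands; "omniflow" is the module
# imported in the inference example. This list is an assumption, not a spec.
REQUIRED = ("torch", "torchvision", "torchaudio", "omniflow")


def check_environment(packages=REQUIRED, min_python=(3, 10)):
    """Return a dict mapping each check name to True (ok) or False (missing)."""
    report = {"python": sys.version_info[:2] >= min_python}
    for name in packages:
        # find_spec returns None when a top-level module cannot be located.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```

Running the script prints one line per check; any `MISSING` entry suggests that the corresponding install step should be repeated inside the active environment.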

Download Checkpoints

Checkpoint (v0.5) is available on Hugging Face.

Checkpoint (v0.9) is available on Hugging Face. This checkpoint is trained on additional data focusing on audio-visual correspondence.

[Coming Soon: We are training a stronger model based on MMDiT-X proposed in SDv3.5]

Inference

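from omniflow import OmniFlowPipeline

# Load the v0.5 checkpoint from ckpts/v0.5.
# device='cuda' requires a CUDA-enabled PyTorch build.
pipeline = OmniFlowPipeline.load_pretrained('ckpts/v0.5', device='cuda')

pipeline.cfg_mode = 'new'
# Text-to-image generation (task='t2i') at 512x512.
imgs = pipeline("portrait of a cyberpunk girl with neon tattoos and a visor,staring intensely. Standing on top of a building", height=512, width=512, add_token_embed=0, task='t2i')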

For more examples of Any-to-Any Generation, check out scripts/Demo.ipynb.
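
When generating many images, the text-to-image call can be wrapped in a small function. This is a sketch only: `generate_image` is a hypothetical helper, it passes exactly the keyword arguments shown in the call above, and it makes no assumption about the type of the returned object. Using a single `size` for both height and width is an illustrative simplification.

```python
def generate_image(pipeline, prompt, size=512):
    """Run text-to-image with the keyword arguments used in the README call.

    `pipeline` is any callable with the same calling convention as the
    loaded OmniFlowPipeline; the return value is passed through unchanged.
    """
    return pipeline(
        prompt,
        height=size,
        width=size,
        add_token_embed=0,
        task="t2i",
    )


# Usage, assuming the checkpoint has been downloaded to ckpts/v0.5:
# from omniflow import OmniFlowPipeline
# pipeline = OmniFlowPipeline.load_pretrained("ckpts/v0.5", device="cuda")
# pipeline.cfg_mode = "new"
# imgs = generate_image(pipeline, "a lighthouse at dusk")
```

Keeping the pipeline as a parameter means the helper can be exercised with a stand-in callable, without loading a checkpoint.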

Training

See scripts/training.md. We also release a filtered synthetic dataset containing text-audio-image triplets on Hugging Face.

Citation

If you find OmniFlow useful in your research, please consider citing:

@article{li2024omniflow,
  title={OmniFlow: Any-to-Any Generation with Multi-Modal Rectified Flows},
  author={Li, Shufan and Kallidromitis, Konstantinos and Gokul, Akash and Liao, Zichun and Kato, Yusuke and Kozuka, Kazuki and Grover, Aditya},
  journal={arXiv preprint arXiv:2412.01169},
  year={2024}
}
