Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add comfyui flux speedup doc #1144

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open

add comfyui flux speedup doc #1144

wants to merge 3 commits into from

Conversation

marigoold
Copy link
Collaborator

@marigoold marigoold commented Nov 20, 2024

add flux doc and related comfyui nodes

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced multiple new nodes in the workflow configuration, enhancing the system's capabilities.
    • Added a new section in the README for setup and usage of the OneDiff tool, including performance comparisons.
    • Implemented new functionality for optimizing machine learning models with the OneDiffFluxSpeedup and OneDiffFluxLoader classes.
  • Documentation

    • Updated README.md to provide detailed setup instructions and performance metrics.
  • Chores

    • Enhanced node loading capabilities by importing new modules for additional functionality.

Copy link
Contributor

coderabbitai bot commented Nov 20, 2024

Walkthrough

The changes involve the addition of a new workflow configuration in the JSON file for the onediff_comfy_nodes project, which introduces 17 new nodes with unique identifiers and properties. These nodes facilitate data processing and transformation tasks. Additionally, the README.md file for the FLUX project has been updated with setup instructions, usage examples, and performance comparisons between PyTorch and OneDiff, enhancing user guidance and understanding of the tool's capabilities. Modifications were also made to the __init__.py file to support the new nodes and a new file was added to implement speedup functionalities.

Changes

File Path Change Summary
onediff_comfy_nodes/benchmarks/resources/workflows/nexfort/flux_speedup.json Added 17 new nodes including DualCLIPLoader, BasicScheduler, KSamplerSelect, and others with unique IDs and properties.
onediff_comfy_nodes/docs/FLUX/README.md Updated README to include setup instructions, usage examples, and performance comparisons for the onediff tool.
onediff_comfy_nodes/init.py Modified lazy_load_extra_nodes to import nodes_flux_speedup and update node mappings.
onediff_comfy_nodes/extras_nodes/nodes_flux_speedup.py Introduced a new file with classes and methods for optimizing models, including OneDiffFluxSpeedup and OneDiffFluxLoader.

Suggested reviewers

  • jackalcooper

Poem

In the land of nodes where data flows,
New friends arrive, as the workflow grows.
With onediff's charm, we leap and bound,
Enhancing our tasks, with joy profound.
So hop along, let the processing start,
For every new node plays a vital part! 🐇✨


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🧹 Outside diff range and nitpick comments (5)
onediff_comfy_nodes/docs/FLUX/README.md (2)

2-2: Format the URL as a proper Markdown link.

Replace the bare URL with a properly formatted Markdown link.

-huggingface: https://huggingface.co/black-forest-labs/FLUX.1-dev
+huggingface: [FLUX.1-dev on Hugging Face](https://huggingface.co/black-forest-labs/FLUX.1-dev)
🧰 Tools
🪛 Markdownlint

2-2: null
Bare URL used

(MD034, no-bare-urls)


21-21: Format the URL as a proper Markdown link.

Replace the bare URL with a properly formatted Markdown link.

-- step1: Get User Access Tokens here https://huggingface.co/settings/tokens
+- step1: Get User Access Tokens from the [Hugging Face Token Settings](https://huggingface.co/settings/tokens)
🧰 Tools
🪛 Markdownlint

21-21: null
Bare URL used

(MD034, no-bare-urls)

onediff_comfy_nodes/benchmarks/resources/workflows/nexfort/flux_speedup.json (3)

682-686: Consider using environment variables or configuration files for model paths.

The model paths are hardcoded in the configuration:

  • flux1-dev.sft
  • ae.sft
  • t5xxl_fp16.safetensors
  • clip_l.safetensors

This could make the workflow less portable across different environments.

Consider:

  1. Using environment variables for model paths
  2. Adding a configuration file for model paths
  3. Supporting relative paths from a base models directory

Also applies to: 717-719, 34-38


377-379: Consider breaking down the long prompt into components.

The prompt is very long and combines multiple aspects:

  • Character description
  • Outfit details
  • Scene setting
  • Lighting conditions
  • Props and actions

This could be harder to maintain and modify.

Consider breaking it down into structured components using a template system or separate configuration, making it easier to mix and match different elements.


438-443: Consider making image resolution more configurable.

Both ModelSamplingFlux and EmptySD3LatentImage use hardcoded resolution values (1024x1024). This might not be optimal for all use cases.

Consider:

  1. Making resolution configurable through environment variables or config files
  2. Adding validation for supported resolution ranges
  3. Supporting aspect ratio preservation when dimensions change

Also applies to: 491-495

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 424c81a and 99a83a6.

⛔ Files ignored due to path filters (2)
  • onediff_comfy_nodes/benchmarks/resources/workflows/nexfort/flux_speedup.png is excluded by !**/*.png
  • onediff_comfy_nodes/docs/FLUX/flux_quality_compare.png is excluded by !**/*.png
📒 Files selected for processing (2)
  • onediff_comfy_nodes/benchmarks/resources/workflows/nexfort/flux_speedup.json (1 hunks)
  • onediff_comfy_nodes/docs/FLUX/README.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
onediff_comfy_nodes/docs/FLUX/README.md

[uncategorized] ~47-~47: Possible missing preposition found.
Context: ...# WorkFlow Here is a very basic example how to use it: [workflow_flux_speedup.json]...

(AI_EN_LECTOR_MISSING_PREPOSITION)


[uncategorized] ~68-~68: The grammatical number of this noun doesn’t look right. Consider replacing it.
Context: ...rence, and it varies a lot on different CPU. ## Quality The following table sh...

(AI_EN_LECTOR_REPLACEMENT_NOUN_NUMBER)

🪛 Markdownlint
onediff_comfy_nodes/docs/FLUX/README.md

2-2: null
Bare URL used

(MD034, no-bare-urls)


21-21: null
Bare URL used

(MD034, no-bare-urls)

🔇 Additional comments (2)
onediff_comfy_nodes/benchmarks/resources/workflows/nexfort/flux_speedup.json (2)

1-4: LGTM: File structure and versioning are well-defined.

The configuration follows the expected JSON structure with proper versioning (0.4) and tracking of node/link IDs.

Also applies to: 868-870


722-867: Verify node connections and memory management.

The workflow contains complex node connections that might impact memory usage. The note mentions potential memory issues that can be addressed using fp8 types.

✅ Verification successful

Node connections and memory management are correctly configured

Based on the workflow file and documentation:

  1. The workflow uses the recommended FLUX configuration with proper node connections
  2. Memory management is addressed through the available fp8 option:
    • The note in the workflow explicitly mentions "You can set the weight_dtype to one of the fp8 types if you have memory issues"
    • Documentation confirms fp8 support through t5xxl_fp8_e4m3fn.safetensors as an alternative to fp16

The connections between nodes follow the expected pattern for FLUX workflows, and the memory optimization suggestion is properly documented in the workflow notes.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for potential memory issues in the workflow

# Look for memory-related configurations and comments
rg -i "memory|fp8|dtype" --type json

# Check for similar workflows to compare memory handling
fd -e json . | xargs grep -l "OneDiffFluxLoader"

Length of output: 236


Script:

#!/bin/bash
# Let's try a different approach to analyze the workflow configuration

# First, check the content of the workflow file
cat onediff_comfy_nodes/benchmarks/resources/workflows/nexfort/flux_speedup.json

# Look for any documentation or configuration files that might explain the node types
fd -e md -e txt . | xargs grep -l -i "node.*connection\|memory\|workflow"

# Search for similar workflow files
fd -e json . | grep -v "package.json\|tsconfig.json"

Length of output: 21358


Script:

#!/bin/bash
# Let's check the documentation for memory-related information and node connections

# Check FLUX documentation which seems relevant to this workflow
cat onediff_comfy_nodes/docs/FLUX/README.md

# Check the main README for any relevant information
cat onediff_comfy_nodes/README.md

# Look specifically for fp8 related configurations
rg -i "fp8|weight_dtype" onediff_comfy_nodes/

Length of output: 9498

Comment on lines +75 to +76
The following table shows the comparison of the plot, seed=0, Baseline (non optimized) on the left, and OneDiff (optimized) on the right
![quality compare](./flux_quality_compare.png)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

Missing quality comparison image referenced in documentation

The referenced image flux_quality_compare.png is missing from the expected path. This broken image reference affects the documentation's completeness and readability.

  • onediff_comfy_nodes/docs/FLUX/README.md: Add the missing image file flux_quality_compare.png or update the documentation to reference the correct image path.
🔗 Analysis chain

Verify the quality comparison image exists.

Ensure the referenced image file exists and is accessible.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Verify the quality comparison image exists
if [ ! -f "docs/FLUX/flux_quality_compare.png" ]; then
    echo "Warning: Quality comparison image does not exist"
fi

Length of output: 158

Comment on lines +47 to +49
Here is a very basic example how to use it:
[workflow_flux_speedup.json](../../benchmarks/resources/workflows/nexfort/flux_speedup.json)
![flux_speedup_workflow](../../benchmarks/resources/workflows/nexfort/flux_speedup.png)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

Fix grammar and broken file references

The grammar issue needs to be fixed, and the referenced files are missing in the specified paths. Please update both the grammar and file references:

  1. Fix the grammar by adding "of" after "example"
  2. The following referenced files are missing and need to be updated with correct paths:
    • benchmarks/resources/workflows/nexfort/flux_speedup.json
    • benchmarks/resources/workflows/nexfort/flux_speedup.png
🔗 Analysis chain

Fix grammar and verify referenced files.

  1. The sentence needs a preposition "of" after "example".
  2. Ensure the referenced workflow file and image exist in the specified paths.
-Here is a very basic example how to use it:
+Here is a very basic example of how to use it:
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Verify referenced files exist
files_to_check=(
    "benchmarks/resources/workflows/nexfort/flux_speedup.json"
    "benchmarks/resources/workflows/nexfort/flux_speedup.png"
)

for file in "${files_to_check[@]}"; do
    if [ ! -f "$file" ]; then
        echo "Warning: Referenced file $file does not exist"
    fi
done

Length of output: 760

🧰 Tools
🪛 LanguageTool

[uncategorized] ~47-~47: Possible missing preposition found.
Context: ...# WorkFlow Here is a very basic example how to use it: [workflow_flux_speedup.json]...

(AI_EN_LECTOR_MISSING_PREPOSITION)

Comment on lines +8 to +15
COMFYUI_DIR=$pwd/ComfyUI
# install ComfyUI
git clone https://github.com/comfyanonymous/ComfyUI.git

# install onediff & onediff_comfy_nodes
git clone https://github.com/siliconflow/onediff.git
cd onediff && pip install -r onediff_comfy_nodes/FLUX/requirements.txt && pip install -e .
ln -s $pwd/onediff/onediff_comfy_nodes $COMFYUI_DIR/custom_nodes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Fix shell command syntax and enhance setup instructions.

  1. The $pwd variable usage is incorrect. In shell scripts, use $(pwd) or ${PWD}.
  2. Consider adding prerequisite checks for Python version and git.
-COMFYUI_DIR=$pwd/ComfyUI
+# Check Python version
+python3 --version  # Should be 3.10
+
+# Set up directory
+COMFYUI_DIR=$(pwd)/ComfyUI
 
 # install ComfyUI
 git clone https://github.com/comfyanonymous/ComfyUI.git
 
 # install onediff & onediff_comfy_nodes
 git clone https://github.com/siliconflow/onediff.git
 cd onediff && pip install -r onediff_comfy_nodes/FLUX/requirements.txt && pip install -e .
-ln -s $pwd/onediff/onediff_comfy_nodes  $COMFYUI_DIR/custom_nodes
+ln -s $(pwd)/onediff/onediff_comfy_nodes $COMFYUI_DIR/custom_nodes
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
COMFYUI_DIR=$pwd/ComfyUI
# install ComfyUI
git clone https://github.com/comfyanonymous/ComfyUI.git
# install onediff & onediff_comfy_nodes
git clone https://github.com/siliconflow/onediff.git
cd onediff && pip install -r onediff_comfy_nodes/FLUX/requirements.txt && pip install -e .
ln -s $pwd/onediff/onediff_comfy_nodes $COMFYUI_DIR/custom_nodes
# Check Python version
python3 --version # Should be 3.10
# Set up directory
COMFYUI_DIR=$(pwd)/ComfyUI
# install ComfyUI
git clone https://github.com/comfyanonymous/ComfyUI.git
# install onediff & onediff_comfy_nodes
git clone https://github.com/siliconflow/onediff.git
cd onediff && pip install -r onediff_comfy_nodes/FLUX/requirements.txt && pip install -e .
ln -s $(pwd)/onediff/onediff_comfy_nodes $COMFYUI_DIR/custom_nodes

Comment on lines +25 to +37
export ACCESS_TOKEN="User Access Tokens"
wget --header="Authorization: Bearer $ACCESS_TOKEN" \
https://huggingface.co/black-forest-labs/FLUX.1-dev/resolve/main/flux1-dev.safetensors -O models/checkpoints/flux1-dev.safetensors

wget --header="Authorization: Bearer $ACCESS_TOKEN" \
https://huggingface.co/stabilityai/stable-diffusion-3-medium/resolve/main/text_encoders/clip_l.safetensors -O models/clip/clip_l.safetensors

# wget --header="Authorization: Bearer $ACCESS_TOKEN" \
# https://huggingface.co/stabilityai/stable-diffusion-3-medium/resolve/main/text_encoders/t5xxl_fp16.safetensors -O models/clip/t5xxl_fp16.safetensors

wget --header="Authorization: Bearer $ACCESS_TOKEN" \
https://huggingface.co/stabilityai/stable-diffusion-3-medium/resolve/main/text_encoders/t5xxl_fp8_e4m3fn.safetensors -O models/clip/t5xxl_fp8_e4m3fn.safetensors
```
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Enhance security and clarity of model download instructions.

  1. Consider using environment variables or a configuration file for token management.
  2. The commented-out wget command should either be removed or explained.
-export ACCESS_TOKEN="User Access Tokens"
+# Load token from environment or .env file
+if [ -z "$HF_ACCESS_TOKEN" ]; then
+    echo "Please set HF_ACCESS_TOKEN environment variable"
+    exit 1
+fi

 wget --header="Authorization: Bearer $ACCESS_TOKEN" \
 https://huggingface.co/black-forest-labs/FLUX.1-dev/resolve/main/flux1-dev.safetensors -O models/checkpoints/flux1-dev.safetensors

 wget --header="Authorization: Bearer $ACCESS_TOKEN" \
 https://huggingface.co/stabilityai/stable-diffusion-3-medium/resolve/main/text_encoders/clip_l.safetensors -O models/clip/clip_l.safetensors

-# wget --header="Authorization: Bearer $ACCESS_TOKEN" \
-# https://huggingface.co/stabilityai/stable-diffusion-3-medium/resolve/main/text_encoders/t5xxl_fp16.safetensors -O models/clip/t5xxl_fp16.safetensors
+# Note: FP16 model is optional and can be used instead of FP8 model if higher precision is needed

 wget --header="Authorization: Bearer $ACCESS_TOKEN" \
 https://huggingface.co/stabilityai/stable-diffusion-3-medium/resolve/main/text_encoders/t5xxl_fp8_e4m3fn.safetensors -O models/clip/t5xxl_fp8_e4m3fn.safetensors
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
export ACCESS_TOKEN="User Access Tokens"
wget --header="Authorization: Bearer $ACCESS_TOKEN" \
https://huggingface.co/black-forest-labs/FLUX.1-dev/resolve/main/flux1-dev.safetensors -O models/checkpoints/flux1-dev.safetensors
wget --header="Authorization: Bearer $ACCESS_TOKEN" \
https://huggingface.co/stabilityai/stable-diffusion-3-medium/resolve/main/text_encoders/clip_l.safetensors -O models/clip/clip_l.safetensors
# wget --header="Authorization: Bearer $ACCESS_TOKEN" \
# https://huggingface.co/stabilityai/stable-diffusion-3-medium/resolve/main/text_encoders/t5xxl_fp16.safetensors -O models/clip/t5xxl_fp16.safetensors
wget --header="Authorization: Bearer $ACCESS_TOKEN" \
https://huggingface.co/stabilityai/stable-diffusion-3-medium/resolve/main/text_encoders/t5xxl_fp8_e4m3fn.safetensors -O models/clip/t5xxl_fp8_e4m3fn.safetensors
```
# Load token from environment or .env file
if [ -z "$HF_ACCESS_TOKEN" ]; then
echo "Please set HF_ACCESS_TOKEN environment variable"
exit 1
fi
wget --header="Authorization: Bearer $ACCESS_TOKEN" \
https://huggingface.co/black-forest-labs/FLUX.1-dev/resolve/main/flux1-dev.safetensors -O models/checkpoints/flux1-dev.safetensors
wget --header="Authorization: Bearer $ACCESS_TOKEN" \
https://huggingface.co/stabilityai/stable-diffusion-3-medium/resolve/main/text_encoders/clip_l.safetensors -O models/clip/clip_l.safetensors
# Note: FP16 model is optional and can be used instead of FP8 model if higher precision is needed
wget --header="Authorization: Bearer $ACCESS_TOKEN" \
https://huggingface.co/stabilityai/stable-diffusion-3-medium/resolve/main/text_encoders/t5xxl_fp8_e4m3fn.safetensors -O models/clip/t5xxl_fp8_e4m3fn.safetensors

Comment on lines +59 to +68
| Data update date(yyyy-mm-dd) | 2024-11-19 |
| PyTorch E2E time | 12.44 s |
| OneDiff E2E time | 7.12 s (-42%) |
| PyTorch Max Mem Used | 20.510 GiB |
| OneDiff Max Mem Used | 23.094 GiB |
| PyTorch Warmup with Run time | 49.09 s |
| OneDiff Warmup with Compilation time<sup>1</sup> | 113.73 s |
| OneDiff Warmup with Cache time | 20.27 s |

<sup>1</sup> OneDiff Warmup with Compilation time is tested on AMD EPYC 7543 32-Core Processor CPU. Note this is just for reference, and it varies a lot on different CPU.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Fix date and grammar issues in performance comparison.

  1. The data update date appears to be incorrect (shows future date).
  2. Use plural form "CPUs" in the footnote.
  3. Consider adding more context about test conditions.
-| Data update date(yyyy-mm-dd)                     | 2024-11-19                            |
+| Data update date(yyyy-mm-dd)                     | 2023-11-19                            |

-<sup>1</sup> OneDiff Warmup with Compilation time is tested on  AMD EPYC 7543 32-Core Processor CPU. Note this is just for reference, and it varies a lot on different CPU.
+<sup>1</sup> OneDiff Warmup with Compilation time is tested on AMD EPYC 7543 32-Core Processor. Note this is just for reference, and it varies a lot on different CPUs.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
| Data update date(yyyy-mm-dd) | 2024-11-19 |
| PyTorch E2E time | 12.44 s |
| OneDiff E2E time | 7.12 s (-42%) |
| PyTorch Max Mem Used | 20.510 GiB |
| OneDiff Max Mem Used | 23.094 GiB |
| PyTorch Warmup with Run time | 49.09 s |
| OneDiff Warmup with Compilation time<sup>1</sup> | 113.73 s |
| OneDiff Warmup with Cache time | 20.27 s |
<sup>1</sup> OneDiff Warmup with Compilation time is tested on AMD EPYC 7543 32-Core Processor CPU. Note this is just for reference, and it varies a lot on different CPU.
| Data update date(yyyy-mm-dd) | 2023-11-19 |
| PyTorch E2E time | 12.44 s |
| OneDiff E2E time | 7.12 s (-42%) |
| PyTorch Max Mem Used | 20.510 GiB |
| OneDiff Max Mem Used | 23.094 GiB |
| PyTorch Warmup with Run time | 49.09 s |
| OneDiff Warmup with Compilation time<sup>1</sup> | 113.73 s |
| OneDiff Warmup with Cache time | 20.27 s |
<sup>1</sup> OneDiff Warmup with Compilation time is tested on AMD EPYC 7543 32-Core Processor. Note this is just for reference, and it varies a lot on different CPUs.
🧰 Tools
🪛 LanguageTool

[uncategorized] ~68-~68: The grammatical number of this noun doesn’t look right. Consider replacing it.
Context: ...rence, and it varies a lot on different CPU. ## Quality The following table sh...

(AI_EN_LECTOR_REPLACEMENT_NOUN_NUMBER)

Comment on lines +623 to +625
"widgets_values": [
"The reference sampling implementation auto adjusts the shift value based on the resolution, if you don't want this you can just bypass (CTRL-B) this ModelSamplingFlux node.\n"
],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Move setup instructions to proper documentation.

Currently, setup instructions and tips are embedded in Note nodes within the workflow. This makes them:

  1. Hard to find for new users
  2. Not searchable in documentation
  3. Not versioned properly with the code

Consider:

  1. Moving these instructions to a proper README or documentation file
  2. Adding links to the documentation in the workflow file
  3. Including version compatibility information

Also applies to: 648-650

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (3)
onediff_comfy_nodes/__init__.py (1)

50-53: Consider adding inline documentation.

Since this is part of adding flux documentation, consider adding a brief comment explaining why flux speedup is tied to nexfort availability. This would help future maintainers understand the dependency relationship.

 if is_nexfort_available():
+    # Flux speedup requires nexfort backend for optimal performance
     from .extras_nodes import nodes_nexfort_booster, nodes_flux_speedup
onediff_comfy_nodes/extras_nodes/nodes_flux_speedup.py (2)

56-68: Consider encapsulating environment variable changes within a context manager

Setting the environment variable CUBLASLT_WORKSPACE_SIZE globally can lead to unintended side effects, especially in environments where multiple processes or threads might be affected. It's recommended to encapsulate the environment variable change within a context manager to limit its scope to the necessary operations.

You can define a context manager to temporarily set the environment variable:

from contextlib import contextmanager

@contextmanager
def set_env_var(key, value):
    old_value = os.environ.get(key)
    os.environ[key] = value
    try:
        yield
    finally:
        if old_value is None:
            del os.environ[key]
        else:
            os.environ[key] = old_value

Then, modify the speedup function to use this context manager:

 def speedup(model: ModelPatcher, enable_quantize: bool, enable_transform: bool):
-    os.environ["CUBLASLT_WORKSPACE_SIZE"] = str(1024)
+    with set_env_var("CUBLASLT_WORKSPACE_SIZE", "1024"):
         if enable_quantize:
             quantize(
                 model.model.diffusion_model,
                 quant_type="fp8_e4m3_e4m3_dynamic_per_tensor",
             )
         if enable_transform:
             transform_model(model.model.diffusion_model)

70-94: Review default values for quantization and transformation

In the OneDiffFluxSpeedup class, the default values for quantize_model and transform are set to True. Ensure that these defaults align with typical user expectations. If quantization or transformation may not be desired in all cases, consider setting the defaults to False or prompting users to make an explicit choice.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 99a83a6 and 0acda03.

📒 Files selected for processing (2)
  • onediff_comfy_nodes/__init__.py (1 hunks)
  • onediff_comfy_nodes/extras_nodes/nodes_flux_speedup.py (1 hunks)
🔇 Additional comments (1)
onediff_comfy_nodes/__init__.py (1)

50-53: LGTM! Verify the nexfort dependency.

The integration of flux speedup nodes alongside nexfort booster nodes looks correct. The conditional import under is_nexfort_available() suggests that flux speedup depends on nexfort.

Let's verify this dependency relationship:

✅ Verification successful

Conditional import of flux_speedup with nexfort is justified

The verification confirms that flux_speedup has a direct dependency on nexfort:

  • It imports multiple core components from nexfort (RMSNorm4Comfy4D, MLPEmbedder, Modulation, etc.)
  • Uses nexfort's transformation and quantization capabilities
  • The documentation shows it's part of the nexfort optimization workflow

The conditional import under is_nexfort_available() is correct and necessary.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for nexfort dependency in flux speedup implementation
# Expected: Find references to nexfort in the flux speedup implementation

# Search for nexfort imports or usage in flux speedup
rg -l "nexfort" "onediff_comfy_nodes/extras_nodes/nodes_flux_speedup.py" || echo "No direct nexfort references found"

# Check if there are any other files that might define the relationship
rg -l "flux.*nexfort|nexfort.*flux" || echo "No relationship found in other files"

Length of output: 235


Script:

#!/bin/bash
# Let's check the actual nexfort usage in flux speedup implementation
rg "nexfort" "onediff_comfy_nodes/extras_nodes/nodes_flux_speedup.py" -C 2

# Also check the relationship documented in the README
cat "onediff_comfy_nodes/docs/FLUX/README.md"

Length of output: 4483

Comment on lines +118 to +128
def load_unet(self, unet_name, enable_quant, enable_transform):
model_options = {}
unet_path = folder_paths.get_full_path("diffusion_models", unet_name)
with comfy_cpu_context():
model = comfy.sd.load_diffusion_model(
unet_path, model_options=model_options
)
model.offload_device = comfy.model_management.unet_offload_device()
speedup(model, enable_quant, enable_transform)

return (model,)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Ensure proper handling of model loading exceptions

When loading the diffusion model using comfy.sd.load_diffusion_model, any exceptions that occur during the loading process may not be handled, which could lead to unhandled exceptions and potentially crash the application. Consider adding try-except blocks to handle exceptions gracefully and provide informative error messages to the user.

Apply this diff to enhance error handling:

 def load_unet(self, unet_name, enable_quant, enable_transform):
     model_options = {}
     unet_path = folder_paths.get_full_path("diffusion_models", unet_name)
     with comfy_cpu_context():
-        model = comfy.sd.load_diffusion_model(
-            unet_path, model_options=model_options
-        )
+        try:
+            model = comfy.sd.load_diffusion_model(
+                unet_path, model_options=model_options
+            )
+        except Exception as e:
+            print(f"Error loading diffusion model: {e}")
+            return (None,)
     model.offload_device = comfy.model_management.unet_offload_device()
     speedup(model, enable_quant, enable_transform)

     return (model,)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def load_unet(self, unet_name, enable_quant, enable_transform):
model_options = {}
unet_path = folder_paths.get_full_path("diffusion_models", unet_name)
with comfy_cpu_context():
model = comfy.sd.load_diffusion_model(
unet_path, model_options=model_options
)
model.offload_device = comfy.model_management.unet_offload_device()
speedup(model, enable_quant, enable_transform)
return (model,)
def load_unet(self, unet_name, enable_quant, enable_transform):
model_options = {}
unet_path = folder_paths.get_full_path("diffusion_models", unet_name)
with comfy_cpu_context():
try:
model = comfy.sd.load_diffusion_model(
unet_path, model_options=model_options
)
except Exception as e:
print(f"Error loading diffusion model: {e}")
return (None,)
model.offload_device = comfy.model_management.unet_offload_device()
speedup(model, enable_quant, enable_transform)
return (model,)

@strint strint changed the title add flux doc add comfyui flux speedup doc Dec 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants