Clip Skip nonfunctional with SDXL-based checkpoints #387

Open
4 of 6 tasks
BlankDiploma opened this issue Feb 24, 2024 · 4 comments
Comments

@BlankDiploma

BlankDiploma commented Feb 24, 2024

Checklist

  • The issue exists after disabling all extensions
  • The issue exists on a clean installation of webui
  • The issue is caused by an extension, but I believe it is caused by a bug in the webui
  • The issue exists in the current version of the webui
  • The issue has not been reported before recently
  • The issue has been reported before but has not been fixed yet

What happened?

No checkpoint based on Stable Diffusion XL, including the base checkpoint, shows any variation when Clip Skip is changed, even if you set it to 12.

The image generation parameters show that the changed Clip Skip value is being recognized: it appears in the image info text after generation completes. However, the value does not affect the output at all.

Steps to reproduce the problem

  1. Load any normal Stable Diffusion checkpoint, generate the same image with Clip Skip set to 1, 2, 12, etc.
  2. Load any Stable Diffusion XL checkpoint, generate the same image with Clip Skip set to 1, 2, 12, etc.

Observe that the Stable Diffusion checkpoint responds to the Clip Skip parameter during image generation, but Stable Diffusion XL checkpoints do not.
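
The comparison above could also be scripted rather than judged by eye. This is a rough sketch, not a tested reproduction. It assumes the webui was started with `--api`, that it listens on the local URL shown in the console log, and that Clip Skip is exposed through the `CLIP_stop_at_last_layers` override key as in upstream webui. The helper names are hypothetical. Pixels are hashed instead of raw PNG bytes because the embedded image info text changes with the setting even when the picture does not.

```python
import base64
import hashlib
import io
import json
import urllib.request

BASE_URL = "http://127.0.0.1:7860"  # assumption: default local URL from the log


def build_payload(prompt, clip_skip, seed=12345):
    """Build a txt2img request body with a fixed seed and a per-request Clip Skip."""
    return {
        "prompt": prompt,
        "seed": seed,
        "steps": 20,
        # assumption: settings key used by upstream webui for Clip Skip
        "override_settings": {"CLIP_stop_at_last_layers": clip_skip},
    }


def generate_pixel_digest(payload, base_url=BASE_URL):
    """POST the payload and return a SHA-256 digest of the first image's pixels."""
    from PIL import Image  # imported lazily; only needed when a server is queried

    request = urllib.request.Request(
        f"{base_url}/sdapi/v1/txt2img",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    raw = base64.b64decode(body["images"][0])
    pixels = Image.open(io.BytesIO(raw)).convert("RGB").tobytes()
    return hashlib.sha256(pixels).hexdigest()


def clip_skip_has_effect(digests):
    """Return True if at least two Clip Skip values produced different images."""
    return len(set(digests.values())) > 1


def compare_clip_skip(prompt, values=(1, 2, 12)):
    """Generate one image per Clip Skip value and report whether any differ."""
    digests = {v: generate_pixel_digest(build_payload(prompt, v)) for v in values}
    return clip_skip_has_effect(digests)
```

With the behavior reported here, `compare_clip_skip(...)` would be expected to return `True` with an SD1 checkpoint loaded and `False` with an SDXL checkpoint loaded.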

What should have happened?

Clip Skip should modify the image output normally.

What browsers do you use to access the UI?

Google Chrome

Sysinfo

sysinfo-2024-02-24-05-27.json

Console logs

Creating venv in directory D:\Stable Diffusion\sdi\stable-diffusion-webui-forge\venv using python "C:\Users\Blank\AppData\Local\Programs\Python\Python310\python.exe"
venv "D:\Stable Diffusion\sdi\stable-diffusion-webui-forge\venv\Scripts\Python.exe"
Python 3.10.5 (tags/v3.10.5:f377153, Jun  6 2022, 16:14:13) [MSC v.1929 64 bit (AMD64)]
Version: f0.0.15v1.8.0rc-latest-233-g2ecb869f
Commit hash: 2ecb869f31f4abab5922c1bd611e375d5bb28e8e
Installing torch and torchvision
Looking in indexes: https://pypi.org/simple, https://download.pytorch.org/whl/cu121
Collecting torch==2.1.2
  Downloading https://download.pytorch.org/whl/cu121/torch-2.1.2%2Bcu121-cp310-cp310-win_amd64.whl (2473.9 MB)
     ---------------------------------------- 2.5/2.5 GB 2.4 MB/s eta 0:00:00
Collecting torchvision==0.16.2
  Downloading https://download.pytorch.org/whl/cu121/torchvision-0.16.2%2Bcu121-cp310-cp310-win_amd64.whl (5.6 MB)
     ---------------------------------------- 5.6/5.6 MB 181.5 MB/s eta 0:00:00
Collecting sympy
  Using cached https://download.pytorch.org/whl/sympy-1.12-py3-none-any.whl (5.7 MB)
Collecting jinja2
  Downloading Jinja2-3.1.3-py3-none-any.whl (133 kB)
     ---------------------------------------- 133.2/133.2 KB 7.7 MB/s eta 0:00:00
Collecting filelock
  Using cached filelock-3.13.1-py3-none-any.whl (11 kB)
Collecting fsspec
  Downloading fsspec-2024.2.0-py3-none-any.whl (170 kB)
     ---------------------------------------- 170.9/170.9 KB 10.0 MB/s eta 0:00:00
Collecting networkx
  Downloading https://download.pytorch.org/whl/networkx-3.2.1-py3-none-any.whl (1.6 MB)
     ---------------------------------------- 1.6/1.6 MB 109.2 MB/s eta 0:00:00
Collecting typing-extensions
  Downloading typing_extensions-4.9.0-py3-none-any.whl (32 kB)
Collecting pillow!=8.3.*,>=5.3.0
  Downloading https://download.pytorch.org/whl/pillow-10.2.0-cp310-cp310-win_amd64.whl (2.6 MB)
     ---------------------------------------- 2.6/2.6 MB 174.1 MB/s eta 0:00:00
Collecting requests
  Using cached requests-2.31.0-py3-none-any.whl (62 kB)
Collecting numpy
  Downloading numpy-1.26.4-cp310-cp310-win_amd64.whl (15.8 MB)
     ---------------------------------------- 15.8/15.8 MB 162.5 MB/s eta 0:00:00
Collecting MarkupSafe>=2.0
  Downloading MarkupSafe-2.1.5-cp310-cp310-win_amd64.whl (17 kB)
Collecting urllib3<3,>=1.21.1
  Downloading urllib3-2.2.1-py3-none-any.whl (121 kB)
     ---------------------------------------- 121.1/121.1 KB 6.9 MB/s eta 0:00:00
Collecting certifi>=2017.4.17
  Downloading certifi-2024.2.2-py3-none-any.whl (163 kB)
     ---------------------------------------- 163.8/163.8 KB ? eta 0:00:00
Collecting idna<4,>=2.5
  Downloading idna-3.6-py3-none-any.whl (61 kB)
     ---------------------------------------- 61.6/61.6 KB 3.2 MB/s eta 0:00:00
Collecting charset-normalizer<4,>=2
  Using cached charset_normalizer-3.3.2-cp310-cp310-win_amd64.whl (100 kB)
Collecting mpmath>=0.19
  Using cached https://download.pytorch.org/whl/mpmath-1.3.0-py3-none-any.whl (536 kB)
Installing collected packages: mpmath, urllib3, typing-extensions, sympy, pillow, numpy, networkx, MarkupSafe, idna, fsspec, filelock, charset-normalizer, certifi, requests, jinja2, torch, torchvision
Successfully installed MarkupSafe-2.1.5 certifi-2024.2.2 charset-normalizer-3.3.2 filelock-3.13.1 fsspec-2024.2.0 idna-3.6 jinja2-3.1.3 mpmath-1.3.0 networkx-3.2.1 numpy-1.26.4 pillow-10.2.0 requests-2.31.0 sympy-1.12 torch-2.1.2+cu121 torchvision-0.16.2+cu121 typing-extensions-4.9.0 urllib3-2.2.1
WARNING: You are using pip version 22.0.4; however, version 24.0 is available.
You should consider upgrading via the 'D:\Stable Diffusion\sdi\stable-diffusion-webui-forge\venv\Scripts\python.exe -m pip install --upgrade pip' command.
Installing clip
Installing open_clip
Cloning assets into D:\Stable Diffusion\sdi\stable-diffusion-webui-forge\repositories\stable-diffusion-webui-assets...
Cloning into 'D:\Stable Diffusion\sdi\stable-diffusion-webui-forge\repositories\stable-diffusion-webui-assets'...
remote: Enumerating objects: 20, done.
remote: Counting objects: 100% (20/20), done.
remote: Compressing objects: 100% (18/18), done.
remote: Total 20 (delta 0), reused 20 (delta 0), pack-reused 0
Receiving objects: 100% (20/20), 132.70 KiB | 2.07 MiB/s, done.
Cloning Stable Diffusion into D:\Stable Diffusion\sdi\stable-diffusion-webui-forge\repositories\stable-diffusion-stability-ai...
Cloning into 'D:\Stable Diffusion\sdi\stable-diffusion-webui-forge\repositories\stable-diffusion-stability-ai'...
remote: Enumerating objects: 580, done.
remote: Counting objects: 100% (357/357), done.
remote: Compressing objects: 100% (128/128), done.
remote: Total 580 (delta 260), reused 229 (delta 229), pack-reused 223
Receiving objects:  95% (551/580), 54.71 MiB | 36.35 MiB/s
Receiving objects: 100% (580/580), 73.44 MiB | 37.43 MiB/s, done.
Resolving deltas: 100% (279/279), done.
Cloning Stable Diffusion XL into D:\Stable Diffusion\sdi\stable-diffusion-webui-forge\repositories\generative-models...
Cloning into 'D:\Stable Diffusion\sdi\stable-diffusion-webui-forge\repositories\generative-models'...
remote: Enumerating objects: 871, done.
remote: Counting objects: 100% (500/500), done.
remote: Compressing objects: 100% (235/235), done.
remote: Total 871 (delta 375), reused 270 (delta 264), pack-reused 371
Receiving objects: 100% (871/871), 42.67 MiB | 27.14 MiB/s, done.
Resolving deltas: 100% (452/452), done.
Cloning K-diffusion into D:\Stable Diffusion\sdi\stable-diffusion-webui-forge\repositories\k-diffusion...
Cloning into 'D:\Stable Diffusion\sdi\stable-diffusion-webui-forge\repositories\k-diffusion'...
remote: Enumerating objects: 1340, done.
remote: Counting objects: 100% (622/622), done.
remote: Compressing objects: 100% (86/86), done.

Receiving objects: 100% (1340/1340), 242.04 KiB | 1.47 MiB/s, done.
Resolving deltas: 100% (939/939), done.
Cloning BLIP into D:\Stable Diffusion\sdi\stable-diffusion-webui-forge\repositories\BLIP...
Cloning into 'D:\Stable Diffusion\sdi\stable-diffusion-webui-forge\repositories\BLIP'...
remote: Enumerating objects: 277, done.
remote: Counting objects: 100% (165/165), done.
remote: Compressing objects: 100% (30/30), done.
Receiving objects: 100% (277/277)used 136 (delta 135), pack-reused 112
Receiving objects: 100% (277/277), 7.03 MiB | 23.31 MiB/s, done.
Resolving deltas: 100% (152/152), done.
Installing requirements
Installing forge_legacy_preprocessor requirement: fvcore
Installing forge_legacy_preprocessor requirement: mediapipe
Installing forge_legacy_preprocessor requirement: onnxruntime
Installing forge_legacy_preprocessor requirement: svglib
Installing forge_legacy_preprocessor requirement: insightface
Installing forge_legacy_preprocessor requirement: handrefinerportable
Installing forge_legacy_preprocessor requirement: depth_anything
Launching Web UI with arguments:
Total VRAM 24563 MB, total RAM 130983 MB
Set vram state to: NORMAL_VRAM
Device: cuda:0 NVIDIA GeForce RTX 4090 : native
Hint: your device supports --pin-shared-memory for potential speed improvements.
Hint: your device supports --cuda-malloc for potential speed improvements.
VAE dtype: torch.bfloat16
Using pytorch cross attention
Downloading: "https://huggingface.co/lllyasviel/fav_models/resolve/main/fav/realisticVisionV51_v51VAE.safetensors" to D:\Stable Diffusion\sdi\stable-diffusion-webui-forge\models\Stable-diffusion\realisticVisionV51_v51VAE.safetensors

100%|██████████████████████████████████████████████████████████████████████████████| 1.99G/1.99G [00:08<00:00, 254MB/s]
ControlNet preprocessor location: D:\Stable Diffusion\sdi\stable-diffusion-webui-forge\models\ControlNetPreprocessor
Calculating sha256 for D:\Stable Diffusion\sdi\stable-diffusion-webui-forge\models\Stable-diffusion\realisticVisionV51_v51VAE.safetensors: 2024-02-23 21:36:20,692 - ControlNet - INFO - ControlNet UI callback registered.
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Startup time: 420.4s (prepare environment: 393.6s, import torch: 6.3s, import gradio: 1.8s, setup paths: 2.5s, initialize shared: 0.4s, other imports: 2.4s, list SD models: 8.6s, load scripts: 3.3s, create ui: 0.7s, gradio launch: 0.4s).
15012c538f503ce2ebfc2c8547b268c75ccdaff7a281db55399940ff1d70e21d
Loading weights [15012c538f] from D:\Stable Diffusion\sdi\stable-diffusion-webui-forge\models\Stable-diffusion\realisticVisionV51_v51VAE.safetensors
model_type EPS
UNet ADM Dimension 0
Using pytorch attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using pytorch attention in VAE
extra {'cond_stage_model.clip_l.text_projection', 'cond_stage_model.clip_l.logit_scale'}
To load target model SD1ClipModel
Begin to load 1 model
Reuse 0 loaded models
[Memory Management] Current Free Memory (MB) =  22981.9990234375
[Memory Management] Model Memory (MB) =  454.2076225280762
[Memory Management] Estimated Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining Memory (MB) =  21503.791400909424
Moving model(s) has taken 0.50 seconds
Model loaded in 5.6s (calculate hash: 3.4s, forge load real models: 1.3s, calculate empty prompt: 0.8s).
Calculating sha256 for D:\Stable Diffusion\sdi\stable-diffusion-webui-forge\models\Stable-diffusion\sd_xl_base_1.0.safetensors: 31e35c80fc4829d14f90153f4c74cd59c90b779f6afe05a74cd6120b893f7e5b
Loading weights [31e35c80fc] from D:\Stable Diffusion\sdi\stable-diffusion-webui-forge\models\Stable-diffusion\sd_xl_base_1.0.safetensors
model_type EPS
UNet ADM Dimension 2816
Using pytorch attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using pytorch attention in VAE
extra {'cond_stage_model.clip_l.text_projection', 'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids', 'cond_stage_model.clip_l.logit_scale'}
To load target model SDXLClipModel
Begin to load 1 model
Reuse 0 loaded models
[Memory Management] Current Free Memory (MB) =  22610.3603515625
[Memory Management] Model Memory (MB) =  2144.3546981811523
[Memory Management] Estimated Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining Memory (MB) =  19442.005653381348
Moving model(s) has taken 0.66 seconds
Model loaded in 10.2s (unload existing model: 0.3s, calculate hash: 5.3s, forge load real models: 3.7s, calculate empty prompt: 0.8s).
Downloading VAEApprox model to: D:\Stable Diffusion\sdi\stable-diffusion-webui-forge\models\VAE-approx\vaeapprox-sdxl.pt
100%|███████████████████████████████████████████████████████████████████████████████| 209k/209k [00:00<00:00, 15.3MB/s]
To load target model SDXL
Begin to load 1 model
Reuse 0 loaded models
[Memory Management] Current Free Memory (MB) =  21133.46533203125
[Memory Management] Model Memory (MB) =  4897.086494445801
[Memory Management] Estimated Inference Memory (MB) =  1310.72
[Memory Management] Estimated Remaining Memory (MB) =  14925.65883758545
Moving model(s) has taken 1.16 seconds
100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [00:03<00:00,  5.46it/s]
To load target model AutoencoderKL█████████████████████████████████████████████████████| 20/20 [00:03<00:00,  5.99it/s]
Begin to load 1 model
Reuse 0 loaded models
[Memory Management] Current Free Memory (MB) =  16114.41162109375
[Memory Management] Model Memory (MB) =  159.55708122253418
[Memory Management] Estimated Inference Memory (MB) =  4356.0
[Memory Management] Estimated Remaining Memory (MB) =  11598.854539871216
Moving model(s) has taken 0.42 seconds
Total progress: 100%|██████████████████████████████████████████████████████████████████| 20/20 [00:04<00:00,  4.61it/s]
100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [00:03<00:00,  5.66it/s]
Total progress: 100%|██████████████████████████████████████████████████████████████████| 20/20 [00:04<00:00,  4.66it/s]
100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [00:03<00:00,  5.51it/s]
Total progress: 100%|██████████████████████████████████████████████████████████████████| 20/20 [00:04<00:00,  4.61it/s]
100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [00:03<00:00,  5.65it/s]
Total progress: 100%|██████████████████████████████████████████████████████████████████| 20/20 [00:04<00:00,  4.67it/s]
Total progress: 100%|██████████████████████████████████████████████████████████████████| 20/20 [00:04<00:00,  5.98it/s]

Additional information

No response

@catboxanon
Collaborator

catboxanon commented Feb 24, 2024

At least upstream, it was determined CLIP skip should not affect the output of any SDXL generations since all models were trained using the penultimate layer (which was not the case for SD1). AUTOMATIC1111/stable-diffusion-webui#12518 (comment)
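
To illustrate the point, here is a toy sketch of what Clip Skip selects. It is not the webui or Forge code. The encoder is a made-up stack of 12 functions and all names are hypothetical. Real CLIP encoders also apply a final layer norm, which is omitted here. With Clip Skip 1 the last layer's output is used, and with 2 the penultimate one. If the SDXL path always takes the penultimate layer, as described above, the setting cannot change the result.

```python
def encode_prompt(token_vector, layers, clip_skip=1):
    """Run all layers in order and return the hidden state `clip_skip` layers from the end.

    clip_skip=1 returns the final layer's output, clip_skip=2 the penultimate one.
    """
    if not 1 <= clip_skip <= len(layers):
        raise ValueError("clip_skip out of range")
    hidden_states = []
    state = token_vector
    for layer in layers:
        state = layer(state)
        hidden_states.append(state)
    return hidden_states[-clip_skip]


def encode_prompt_fixed_penultimate(token_vector, layers, clip_skip=1):
    """Mimic the reported SDXL behavior: the setting is accepted but ignored."""
    return encode_prompt(token_vector, layers, clip_skip=2)


# Toy 12-layer "encoder": layer number k adds k to every element.
toy_layers = [lambda v, k=i + 1: [x + k for x in v] for i in range(12)]

sd1_like = {skip: encode_prompt([0], toy_layers, skip) for skip in (1, 2, 12)}
sdxl_like = {skip: encode_prompt_fixed_penultimate([0], toy_layers, skip) for skip in (1, 2, 12)}
```

In this toy setup `sd1_like` holds three different vectors, while every entry of `sdxl_like` is the same. That matches the symptom in the report: the value is recorded but the output does not move.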

@BlankDiploma
Author

> At least upstream, it was determined CLIP skip should not affect the output of any SDXL generations since all models were trained using the penultimate layer (which was not the case for SD1). AUTOMATIC1111/stable-diffusion-webui#12518 (comment)

Oh. Well, that certainly explains it. Maybe it should be hidden from the UI when an SDXL model is loaded? It's at least a bit misleading the way it's currently displayed.

@lllyasviel
Owner

I can fix it in 5 minutes, but Forge tries to produce the same results as webui.

However, if most users vote for a functional SDXL Clip Skip and upstream refuses to add it, Forge may still implement it after multiple user reports, because it can be seen as a backend matter.

@catboxanon changed the title from "[Bug]: Clip Skip completely nonfunctional with SDXL-based checkpoints." to "Clip Skip nonfunctional with SDXL-based checkpoints" on Feb 26, 2024
@mirh

mirh commented Oct 7, 2024
