initial commit

pkuliyi2015 · May 20, 2023 · 97bc1ff
commit 97bc1ff

Showing 12 changed files with 1,711 additions and 0 deletions.
diff --git a/.gitignore b/.gitignore
@@ -0,0 +1,7 @@
+# meta
+.vscode/
+__pycache__/
+.DS_Store
+
+# settings
+models/
diff --git a/LICENSE b/LICENSE
@@ -0,0 +1,35 @@
+S-Lab License 1.0
+
+Copyright 2022 S-Lab
+
+Redistribution and use for non-commercial purpose in source and 
+binary forms, with or without modification, are permitted provided 
+that the following conditions are met:
+
+1. Redistributions of source code must retain the above copyright 
+   notice, this list of conditions and the following disclaimer.
+
+2. Redistributions in binary form must reproduce the above copyright 
+   notice, this list of conditions and the following disclaimer in 
+   the documentation and/or other materials provided with the 
+   distribution.
+
+3. Neither the name of the copyright holder nor the names of its 
+   contributors may be used to endorse or promote products derived 
+   from this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS 
+"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT 
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR 
+A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT 
+HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, 
+SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT 
+LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, 
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY 
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT 
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+In the event that redistribution and/or use for commercial purpose in 
+source or binary forms, with or without modification is required, 
+please contact the contributor(s) of the work.
diff --git a/LICENSE2 b/LICENSE2
diff --git a/README.md b/README.md
@@ -0,0 +1,122 @@
+# StableSR for Stable Diffusion WebUI
+
+Licensed under S-Lab License 1.0
+
+[![CC BY-NC-SA 4.0][cc-by-nc-sa-shield]][cc-by-nc-sa]
+
+English｜[中文](README_CN.md)
+
+- StableSR is a competitive super-resolution method originally proposed by Jianyi Wang et al.
+- This repository is a migration of the StableSR project to the Automatic1111 WebUI.
+
+Relevant Links
+
+> Click to view high-quality official examples!
+
+- [Project Page](https://iceclear.github.io/projects/stablesr/)
+- [Official Repository](https://github.com/IceClear/StableSR)
+- [Paper on arXiv](https://arxiv.org/abs/2305.07015)
+
+> If you find this project useful, please give me & Jianyi Wang a star! ⭐
+---
+## Usage
+
+### 1. Installation
+
+⚪ Method 1: URL Install
+
+- Open Automatic1111 WebUI -> Click Tab "Extensions" -> Click Tab "Install from URL" -> type in https://github.com/pkuliyi2015/sd-webui-stablesr.git -> Click "Install" 
+
+![installation](https://github.com/pkuliyi2015/multidiffusion-img-demo/blob/master/installation.png?raw=true)
+
+⚪ Method 2: In progress...
+
+> After successful installation, you should see "StableSR" in the img2img Scripts dropdown list.
+
+### 2. Download the main components
+
+- You MUST use the Stable Diffusion V2.1 512 **EMA** checkpoint (~5.21GB) from StabilityAI
+    - You can download it from [HuggingFace](https://huggingface.co/stabilityai/stable-diffusion-2-1-base)
+    - Put it in stable-diffusion-webui/models/Stable-Diffusion/
+- Download the pruned StableSR module (~400MB)
+    - Official resources: In Progress
+    - My resources: <[GoogleDrive](https://drive.google.com/file/d/1tWjkZQhfj07sHDR4r9Ta5Fk4iMp1t3Qw/view?usp=sharing)> <[百度网盘-提取码aguq](https://pan.baidu.com/s/1Nq_6ciGgKnTu0W14QcKKWg?pwd=aguq)>
+    - Put it in stable-diffusion-webui/extensions/sd-webui-stablesr/models/
+
+### 3. Optional components
+
+- Install [Tiled Diffusion & VAE](https://github.com/pkuliyi2015/multidiffusion-upscaler-for-automatic1111) extension
+    - The original StableSR easily runs out of memory (OOM) on large images > 512.
+    - For better quality and less VRAM usage, we recommend Tiled Diffusion & VAE.
+- Use the Official VQGAN VAE (~700MB)
+    - Official resources: In Progress
+    - My resources: <[GoogleDrive](https://drive.google.com/file/d/1ARtDMia3_CbwNsGxxGcZ5UP75W4PeIEI/view?usp=share_link)> <[百度网盘-提取码83u9](https://pan.baidu.com/s/1YCYmGBethR9JZ8-eypoIiQ?pwd=83u9)>
+    - Put it in your stable-diffusion-webui/models/VAE
+
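The file placements in sections 2 and 3 can be checked with a short script. This is a minimal sketch, not part of the extension: the function name `check_stablesr_layout` is hypothetical, the three directories are the ones named in this README, and the list of model file extensions is an assumption.

```python
from pathlib import Path

# Directories named in this README, relative to the WebUI root.
EXPECTED_DIRS = {
    "SD 2.1 base checkpoint": "models/Stable-Diffusion",
    "StableSR module": "extensions/sd-webui-stablesr/models",
    "VQGAN VAE (optional)": "models/VAE",
}

# Assumption: model files use one of these common extensions.
MODEL_SUFFIXES = {".ckpt", ".safetensors", ".pt"}


def check_stablesr_layout(webui_root):
    """Return {component: [model file names found]} for each expected directory."""
    root = Path(webui_root)
    found = {}
    for label, rel in EXPECTED_DIRS.items():
        folder = root / rel
        files = []
        if folder.is_dir():
            files = sorted(
                p.name
                for p in folder.iterdir()
                if p.is_file() and p.suffix.lower() in MODEL_SUFFIXES
            )
        found[label] = files
    return found


if __name__ == "__main__":
    # Assumes the script is run from the parent of the WebUI directory.
    for label, files in check_stablesr_layout("stable-diffusion-webui").items():
        print(f"{label}: {', '.join(files) if files else 'nothing found'}")
```

An empty list for the first two components would suggest a file was put in the wrong folder; the VAE entry may legitimately be empty because that component is optional.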
+### 4. Extension Usage
+
+- At the top of the WebUI, select the v2-1_512-ema-pruned checkpoint you downloaded.
+- Switch to the img2img tab. Find the "Scripts" dropdown at the bottom of the page.
+    - Select the StableSR script.
+    - Click the refresh button and select the StableSR checkpoint you have downloaded.
+    - Choose a scale factor.
+- Upload your image and start generation (can work without prompts).
+
+### 5. Useful Tips
+
+- The Euler a sampler is recommended, with Steps >= 20.
+- For output images larger than 512, we recommend using Tiled Diffusion & VAE; otherwise the image quality may suffer and VRAM usage will be very high.
+- Here are the Tiled Diffusion settings that replicate the official behavior in the paper.
+    - Method = Mixture of Diffusers
+    - Latent tile size = 64, Latent tile overlap = 32
+    - Latent tile batch size: as large as possible without running out of memory.
+    - Upscaler MUST be None.
+- What is "Pure Noise"?
+    - Pure Noise refers to starting from a fully random noise tensor instead of your image. **This is the default behavior in the StableSR paper.**
+    - When enabled, the script ignores your denoising strength and gives you much more detailed images, but it also changes the color & sharpness significantly.
+    - When disabled, the script starts by adding some noise to your image. The result will not be fully detailed, even if you set denoising strength = 1 (though it may still look aesthetically good). See [Comparison](https://imgsli.com/MTgwMTMx).
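The two starting points described under "Pure Noise" can be illustrated with the standard DDPM forward-noising formula. This is a generic sketch on a flat list of numbers, not the extension's code: the function `start_latent` and its parameters are hypothetical, and `alpha_bar_t` stands for the cumulative noise-schedule product at the chosen start step.

```python
import math
import random


def start_latent(image_latent, alpha_bar_t, pure_noise, rng):
    """Return the latent that sampling starts from.

    pure_noise=True : ignore the image and draw every value from N(0, 1).
    pure_noise=False: standard DDPM forward noising of the image latent,
                      x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps.
    """
    noise = [rng.gauss(0.0, 1.0) for _ in image_latent]
    if pure_noise:
        return noise
    keep = math.sqrt(alpha_bar_t)         # share of the image that survives
    mix = math.sqrt(1.0 - alpha_bar_t)    # share of fresh noise
    return [keep * x + mix * e for x, e in zip(image_latent, noise)]


if __name__ == "__main__":
    latent = [0.5, -1.0, 2.0]  # toy stand-in for an encoded image
    print(start_latent(latent, 0.1, pure_noise=False, rng=random.Random(0)))
    print(start_latent(latent, 0.1, pure_noise=True, rng=random.Random(0)))
```

As long as `alpha_bar_t` is above zero, the noised start still carries a trace of the input image, whereas the pure-noise start carries none.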
+
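The Tiled Diffusion settings listed above can be put in pixel terms. Stable Diffusion's VAE downsamples by a factor of 8, so a latent tile of 64 covers 512 pixels and an overlap of 32 covers 256 pixels. The sketch below counts tiles for a plain sliding window with those settings; the helper names are hypothetical, and the Tiled Diffusion extension may place its tiles differently.

```python
import math

# Stable Diffusion's VAE maps each 8x8 pixel block to one latent position.
VAE_SCALE = 8


def tile_starts(latent_len, tile=64, overlap=32):
    """Start offsets of sliding-window tiles covering one latent dimension."""
    if latent_len <= tile:
        return [0]
    stride = tile - overlap
    count = math.ceil((latent_len - overlap) / stride)
    # Clamp the last tile so it ends exactly at the border.
    return [min(i * stride, latent_len - tile) for i in range(count)]


def tile_grid(width_px, height_px, tile=64, overlap=32):
    """Return (columns, rows) of latent tiles for an output size in pixels."""
    cols = len(tile_starts(width_px // VAE_SCALE, tile, overlap))
    rows = len(tile_starts(height_px // VAE_SCALE, tile, overlap))
    return cols, rows


if __name__ == "__main__":
    print(tile_grid(2048, 2048))  # (7, 7): 49 tiles of 512x512 pixels each
```

The tile count is what the latent tile batch size has to divide up, which is why a larger batch size shortens the run when VRAM allows it.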
+### 6. Important Notice
+
+> Why are my results different from the official examples?
+
+- It is neither your fault nor ours.
+    - This extension has the same UNet model weights as StableSR if installed correctly.
+    - If you install the optional VQVAE, the whole model weights will be the same as the official model with fusion weights=0.
+- However, your results will **not be as good as** the official results, because:
+    - Sampler Difference: 
+        - The official repo does 100 or 200 steps of legacy DDPM sampling with a custom timestep scheduler, and samples without negative prompts.
+        - However, WebUI doesn't offer such a sampler, and it must sample with negative prompts. **This is the main difference.**
+    - VQVAE Decoder Difference: 
+        - The official VQVAE Decoder takes some Encoder features as input. 
+        - However, in practice, I found these features are astonishingly large for large images (>10G for 4k images, even in float16!).
+        - Hence, **I removed the CFW component in VAE Decoder**. As this leads to inferior fidelity in details, I will try to add it back later as an option.
+
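The memory pressure from keeping encoder features at image resolution can be estimated with simple arithmetic. The sketch below is only an illustration: the helper name is hypothetical, and the channel count of 128 is an assumed example value, not a number taken from the StableSR code.

```python
def feature_map_bytes(channels, height, width, bytes_per_value=2):
    """Memory for one dense feature map; 2 bytes per value means float16."""
    return channels * height * width * bytes_per_value


# Hypothetical example: one 128-channel feature map at full 4K resolution.
size = feature_map_bytes(128, 2160, 3840)
print(f"{size / 2**30:.2f} GiB")  # 1.98 GiB
```

A single map of that assumed size already takes about 2 GiB in float16, so several maps held at once for the decoder add up quickly.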
+---
+## License
+
+This project is licensed under:
+
+- S-Lab License 1.0.
+- [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License][cc-by-nc-sa], due to the use of the NVIDIA SPADE module.
+
+[![CC BY-NC-SA 4.0][cc-by-nc-sa-image]][cc-by-nc-sa]
+
+[cc-by-nc-sa]: http://creativecommons.org/licenses/by-nc-sa/4.0/
+[cc-by-nc-sa-image]: https://licensebuttons.net/l/by-nc-sa/4.0/88x31.png
+[cc-by-nc-sa-shield]: https://img.shields.io/badge/License-CC%20BY--NC--SA%204.0-lightgrey.svg
+
+### Disclaimer
+
+- All code in this extension is for research purposes only. 
+- The commercial use of the code and checkpoint is **strictly prohibited**.
+
+### Important Notice for Outcome Images
+
+- Please note that the CC BY-NC-SA 4.0 license of the NVIDIA SPADE module also prohibits the commercial use of outcome images.
+- Jianyi Wang may change the SPADE module to a commercial-friendly one but he is busy.
+- If you wish to *speed up* his process for commercial purposes, please contact him through email: iceclearwjy@gmail.com
+
+## Acknowledgments
+
+I would like to thank Jianyi Wang et al. for the original StableSR method.
diff --git a/README_CN.md b/README_CN.md
@@ -0,0 +1,119 @@
+# StableSR - Stable Diffusion WebUI
+
+S-Lab License 1.0 & [![CC BY-NC-SA 4.0][cc-by-nc-sa-shield]][cc-by-nc-sa]
+
+[English](README.md) | 中文
+
+- StableSR 是原初由 Jianyi Wang 等人提出的具有竞争力的超分辨率方法。
+- 本仓库是将 StableSR 项目迁移到 Automatic1111 WebUI 的迁移工作。
+
+相关链接
+
+> 点击查看高质量官方示例！
+
+- [项目页面](https://iceclear.github.io/projects/stablesr/)
+- [官方仓库](https://github.com/IceClear/StableSR)
+- [arXiv 上的论文](https://arxiv.org/abs/2305.07015)
+
+> 如果你觉得这个项目有用，请给我和 Jianyi Wang 点个赞！⭐
+---
+## 使用
+
+### 1. 安装
+
+⚪ 方法 1: URL 安装
+
+- 打开 Automatic1111 WebUI -> 点击 "扩展" 标签页 -> 点击 "从 URL 安装" 标签页 -> 输入 https://github.com/pkuliyi2015/sd-webui-stablesr.git -> 点击 "安装"
+
+![installation](https://github.com/pkuliyi2015/multidiffusion-img-demo/blob/master/installation.png?raw=true)
+
+⚪ 方法 2: 进行中...
+
+> 安装成功后，你应该能在 img2img 脚本下拉列表中看到 "StableSR"。
+
+### 2. 下载主要组件
+
+- 你必须使用来自 StabilityAI 的 Stable Diffusion V2.1 512 **EMA** 检查点（大约 5.21GB）
+    - 你可以从 [HuggingFace](https://huggingface.co/stabilityai/stable-diffusion-2-1-base) 下载它
+    - 放入 stable-diffusion-webui/models/Stable-Diffusion/
+- 下载剪枝后的 StableSR 模块（大约 400MB）
+    - 官方资源：进行中
+    - 我的资源：<[GoogleDrive](https://drive.google.com/file/d/1tWjkZQhfj07sHDR4r9Ta5Fk4iMp1t3Qw/view?usp=sharing)> <[百度网盘-提取码aguq](https://pan.baidu.com/s/1Nq_6ciGgKnTu0W14QcKKWg?pwd=aguq)>
+    - 放入 stable-diffusion-webui/extensions/sd-webui-stablesr/models/
+
+### 3. 可选组件
+
+- 安装 [Tiled Diffusion & VAE](https://github.com/pkuliyi2015/multidiffusion-upscaler-for-automatic1111) 扩展
+    - 原始的 StableSR 对大于 512 的大图像容易出现 OOM。
+    - 为了获得更好的质量和更少的 VRAM 使用，我们建议使用 Tiled Diffusion & VAE。
+- 使用官方 VQGAN VAE（大约 700MB）
+    - 官方资源：进行中
+    - 我的资源：<[GoogleDrive](https://drive.google.com/file/d/1ARtDMia3_CbwNsGxxGcZ5UP75W4PeIEI/view?usp=share_link)> <[百度网盘-提取码83u9](https://pan.baidu.com/s/1YCYmGBethR9JZ8-eypoIiQ?pwd=83u9)>
+    - 将它放在你的 stable-diffusion-webui/models/VAE 中
+
+### 4. 扩展使用
+
+- 在 WebUI 的顶部，选择你下载的 v2-1_512-ema-pruned 检查点。
+- 切换到 img2img 标签。在页面底部找到 "脚本" 下拉列表。
+    - 选择 StableSR 脚本。
+    - 点击刷新按钮并选择你已下载的 StableSR 检查点。
+    - 选择一个比例因子。
+- 上传你的图像并开始生成（无需提示）。
+
+### 5. 有用的提示
+
+- 推荐使用 Euler a 采样器。步数 >= 20。
+- 对于输出图像大小 > 512，我们推荐使用 Tiled Diffusion & VAE，否则，图像质量可能不理想，VRAM 使用量会很大。
+- 这里有一些 Tiled Diffusion 设置，可以复制论文中的官方行为。
+    - 方法 = Mixture of Diffusers
+    - 隐变量瓷砖大小 = 64，隐变量瓷砖重叠 = 32
+    - 隐变量瓷砖批大小尽可能大，避免内存不足。
+    - 上采样器必须为 None。
+- 什么是 "纯噪声"？
+    - 纯噪声指的是从完全随机的噪声张量开始，而不是从你的图像开始。**这是 StableSR 论文中的默认行为。**
+    - 启用时，脚本会忽略你的去噪强度，并给你更详细的图像，但也会显著改变颜色和锐度
+    - 禁用时，脚本会开始添加一些噪声到你的图像。即使你将去噪强度设为 1，结果也不会完全详细（但可能在美感上更好）。参见 [对比](https://imgsli.com/MTgwMTMx)。
+
+### 6. 重要提醒
+
+> 为什么我的结果和官方示例不同？
+
+- 这不是你或我们的错。
+    - 如果正确安装，这个扩展有与 StableSR 相同的 UNet 模型权重。
+    - 如果你安装了可选的 VQVAE，整个模型权重将与融合权重为 0 的官方模型相同。
+- 但是，你的结果将**不如**官方结果，因为：
+    - 采样器差异：
+        - 官方仓库进行 100 或 200 步的 legacy DDPM 采样，并使用自定义的时间步调度器，采样时不使用负提示。
+        - 然而，WebUI 不提供这样的采样器，必须带有负提示进行采样。**这是主要的差异。**
+    - VQVAE 解码器差异：
+        - 官方 VQVAE 解码器将一些编码器特征作为输入。
+        - 然而，在实践中，我发现这些特征对于大图像来说非常大。 (>10G 用于 4k 图像，即使是在 float16！)
+        - 因此，**我移除了 VAE 解码器中的 CFW 组件**。由于这导致了对细节的较低保真度，我将尝试将它作为一个选项添加回去。
+
+---
+## 许可
+
+此项目在以下许可下授权：
+
+- S-Lab License 1.0.
+- [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License][cc-by-nc-sa]，由于使用了 NVIDIA SPADE 模块。
+
+[![CC BY-NC-SA 4.0][cc-by-nc-sa-image]][cc-by-nc-sa]
+
+[cc-by-nc-sa]: http://creativecommons.org/licenses/by-nc-sa/4.0/
+[cc-by-nc-sa-image]: https://licensebuttons.net/l/by-nc-sa/4.0/88x31.png
+[cc-by-nc-sa-shield]: https://img.shields.io/badge/License-CC%20BY--NC--SA%204.0-lightgrey.svg
+
+### 免责声明
+
+- 此扩展中的所有代码仅供研究目的。
+- 代码和检查点的商业用途**严格禁止**。
+
+### 成果图像的重要通知
+
+- 请注意，NVIDIA SPADE 模块中的 CC BY-NC-SA 4.0 许可也禁止使用成果图像进行商业用途。
+- Jianyi Wang 可能会将 SPADE 模块更改为商业友好的一个，但他很忙。
+- 如果你希望*加快*他为商业目的的进程，请通过电子邮件与他联系：iceclearwjy@gmail.com
+
+## 致谢
+
+我要感谢 Jianyi Wang 等人提出的原始 StableSR 方法。