diff --git a/README.md b/README.md
index 143ecc67..7432a9e9 100644
--- a/README.md
+++ b/README.md
@@ -4,6 +4,34 @@ A browser interface based on Gradio library for Stable Diffusion.
 Original script with Gradio UI was written by a kind anonymous user. This is a modification.
 
 ![](screenshot.png)
+
+## Feature showcase
+
+[Detailed feature showcase with images, art by Greg Rutkowski](https://github.com/AUTOMATIC1111/stable-diffusion-webui-feature-showcase)
+
+- Original txt2img and img2img modes
+- One click install and run script (but you still must install python, git and CUDA)
+- Outpainting
+- Inpainting
+- Prompt matrix
+- Stable Diffusion upscale
+- Attention
+- Loopback
+- X/Y plot
+- Textual Inversion
+- Resizing options
+- Sampling method selection
+- Interrupt processing at any time
+- 4GB videocard support
+- Option to use GFPGAN
+- Correct seeds for batches
+- Prompt length validation
+- Generation parameters added as text to PNG
+- Tab to view an existing picture's generation parameters
+- Settings page
+- Running custom code from UI
+- Mouseover hints for most UI elements
+
 ## Installing and running
 
 You need [python](https://www.python.org/downloads/windows/) and [git](https://git-scm.com/download/win)
@@ -31,11 +59,13 @@ You optionally can use GPFGAN to improve faces, then you'll need to download the
 
 #### Troublehooting:
 
-- if your version of Python is not in PATH, edit `webui.bat`, change the line `set PYTHON=python` to say the full path to your python executable: `set PYTHON=B:\soft\Python310\python.exe`. You can do this for python, but not for git.
+- According to reports, installation currently does not work in a directory with spaces in filenames.
+- if your version of Python is not in PATH (or if another version is), edit `webui.bat`, change the line `set PYTHON=python` to say the full path to your python executable: `set PYTHON=B:\soft\Python310\python.exe`. You can do this for python, but not for git.
 - if you get out of memory errors and your videocard has low amount of VRAM (4GB), edit `webui.bat`, change line 5 to from `set COMMANDLINE_ARGS=` to `set COMMANDLINE_ARGS=--medvram` (see below for other possible options)
 - installer creates python virtual environment, so none of installed modules will affect your system installation of python if you had one prior to installing this.
 - to prevent the creation of virtual environment and use your system python, edit `webui.bat` replacing `set VENV_DIR=venv` with `set VENV_DIR=`.
 - webui.bat installs requirements from files `requirements_versions.txt`, which lists versions for modules specifically compatible with Python 3.10.6. If you choose to install for a different version of python, editing `webui.bat` to have `set REQS_FILE=requirements.txt` instead of `set REQS_FILE=requirements_versions.txt` may help (but I still reccomend you to just use the recommended version of python).
+- if you feel you broke something and want to reinstall from scratch, delete directories: `venv`, `repositories`.
 
 ### Manual instructions
 Alternatively, if you don't want to run webui.bat, here are instructions for installing
@@ -123,216 +153,3 @@ Extra: if you get a green screen instead of generated pictures, you have a card
 precision floating point numbers. You must use `--precision full --no-half` in addition to other flags,
 and the model will take much more space in VRAM.
 
-
-## Features
-The script creates a web UI for Stable Diffusion's txt2img and img2img scripts. Following are features added
-that are not in original script.
-
-### Extras tab
-Additional neural network image improvement methods unrelated to stable diffusion.
-
-#### GFPGAN
-Lets you improve faces in pictures using the GFPGAN model. There is a checkbox in every tab to use GFPGAN at 100%, and
-also a separate tab that just allows you to use GFPGAN on any picture, with a slider that controls how strongthe effect is.
-
-![](images/GFPGAN.png)
-
-#### Real-ESRGAN
-Image upscaler. You can choose from multiple models by original author, and specify by how much the image should be upscaled.
-Requires `realesrgan` librarty:
-
-```commandline
-pip install realesrgan
-```
-
-### Sampling method selection
-Pick out of multiple sampling methods for txt2img:
-
-![](images/sampling.png)
-
-### Prompt matrix
-Separate multiple prompts using the `|` character, and the system will produce an image for every combination of them.
-For example, if you use `a busy city street in a modern city|illustration|cinematic lighting` prompt, there are four combinations possible (first part of prompt is always kept):
-
-- `a busy city street in a modern city`
-- `a busy city street in a modern city, illustration`
-- `a busy city street in a modern city, cinematic lighting`
-- `a busy city street in a modern city, illustration, cinematic lighting`
-
-Four images will be produced, in this order, all with same seed and each with corresponding prompt:
-![](images/prompt-matrix.png)
-
-Another example, this time with 5 prompts and 16 variations:
-![](images/prompt_matrix.jpg)
-
-If you use this feature, batch count will be ignored, because the number of pictures to produce
-depends on your prompts, but batch size will still work (generating multiple pictures at the
-same time for a small speed boost).
-
-### Flagging
-Click the Flag button under the output section, and generated images will be saved to `log/images` directory, and generation parameters
-will be appended to a csv file `log/log.csv` in the `/sd` directory.
-
-> but every image is saved, why would I need this?
-
-If you're like me, you experiment a lot with prompts and settings, and only few images are worth saving. You can
-just save them using right click in browser, but then you won't be able to reproduce them later because you will not
-know what exact prompt created the image.
If you use the flag button, generation parameters will be written to csv file,
-and you can easily find parameters for an image by searching for its filename.
-
-### Copy-paste generation parameters
-A text output provides generation parameters in an easy to copy-paste form for easy sharing.
-
-![](images/kopipe.png)
-
-If you generate multiple pictures, the displayed seed will be the seed of the first one.
-
-### Correct seeds for batches
-If you use a seed of 1000 to generate two batches of two images each, four generated images will have seeds: `1000, 1001, 1002, 1003`.
-Previous versions of the UI would produce `1000, x, 1001, x`, where x is an image that can't be generated by any seed.
-
-### Resizing
-There are three options for resizing input images in img2img mode:
-
-- Just resize - simply resizes source image to target resolution, resulting in incorrect aspect ratio
-- Crop and resize - resize source image preserving aspect ratio so that entirety of target resolution is occupied by it, and crop parts that stick out
-- Resize and fill - resize source image preserving aspect ratio so that it entirely fits target resolution, and fill empty space by rows/columns from source image
-
-Example:
-![](images/resizing.jpg)
-
-### Loading
-Gradio's loading graphic has a very negative effect on the processing speed of the neural network.
-My RTX 3090 makes images about 10% faster when the tab with gradio is not active. By default, the UI
-now hides loading progress animation and replaces it with static "Loading..." text, which achieves
-the same effect. Use the `--no-progressbar-hiding` commandline option to revert this and show loading animations.
-
-### Prompt validation
-Stable Diffusion has a limit for input text length. If your prompt is too long, you will get a
-warning in the text output field, showing which parts of your text were truncated and ignored by the model.
-
-### Loopback
-A checkbox for img2img allowing to automatically feed output image as input for the next batch. Equivalent to
-saving output image, and replacing input image with it. Batch count setting controls how many iterations of
-this you get.
-
-Usually, when doing this, you would choose one of many images for the next iteration yourself, so the usefulness
-of this feature may be questionable, but I've managed to get some very nice outputs with it that I wasn't abble
-to get otherwise.
-
-Example: (cherrypicked result; original picture by anon)
-
-![](images/loopback.jpg)
-
-### Png info
-Adds information about generation parameters to PNG as a text chunk. You
-can view this information later using any software that supports viewing
-PNG chunk info, for example: https://www.nayuki.io/page/png-file-chunk-inspector
-
-![](images/pnginfo.png)
-
-### Textual Inversion
-Allows you to use pretrained textual inversion embeddings.
-See original site for details: https://textual-inversion.github.io/.
-I used lstein's repo for training embdedding: https://github.com/lstein/stable-diffusion; if
-you want to train your own, I recommend following the guide on his site.
-
-No additional libraries/repositories are required to use pretrained embeddings.
-
-To make use of pretrained embeddings, create `embeddings` directory in the root dir of Stable
-Diffusion and put your embeddings into it. They must be .pt files about 5Kb in size, each with only
-one trained embedding, and the filename (without .pt) will be the term you'd use in prompt
-to get that embedding.
-
-As an example, I trained one for about 5000 steps: https://files.catbox.moe/e2ui6r.pt; it does
-not produce very good results, but it does work. Download and rename it to `Usada Pekora.pt`,
-and put it into `embeddings` dir and use Usada Pekora in prompt.
-
-![](images/inversion.png)
-
-### Settings
-A tab with settings, allowing you to use UI to edit more than half of parameters that previously
-were commandline. Settings are saved to config.js file. Settings that remain as commandline
-options are ones that are required at startup.
-
-### Attention
-Using `()` in prompt increases model's attention to enclosed words, and `[]` decreases it. You can combine
-multiple modifiers:
-
-![](images/attention-3.jpg)
-
-### SD upscale
-Upscale image using RealESRGAN and then go through tiles of the result, improving them with img2img.
-
-Original idea by: https://github.com/jquesnelle/txt2imghd. This is an independent implementation.
-
-To use this feature, tick a checkbox in the img2img interface. Original
-image will be upscaled to twice the original width and height, while width and height sliders
-will specify the size of individual tiles. At the moment this method does not support batch size.
-
-Rcommended parameters for upscaling:
- - Sampling method: Euler a
- - Denoising strength: 0.2, can go up to 0.4 if you feel adventureous
-
-![](images/sd-upscale.jpg)
-
-### User scripts
-If the program is launched with `--allow-code` option, an extra text input field for script code
-is available in txt2img interface. It allows you to input python code that will do the work with
-image. If this field is not empty, the processing that would happen normally is skipped.
-
-In code, access parameters from web UI using the `p` variable, and provide outputs for web UI
-using the `display(images, seed, info)` function. All globals from script are also accessible.
-
-As an example, here is a script that draws a chart seen below (and also saves it as `test/gnomeplot/gnome5.png`):
-
-```python
-steps = [4, 8,12,16, 20]
-cfg_scales = [5.0,10.0,15.0]
-
-def cell(x, y, p=p):
-    p.steps = x
-    p.cfg_scale = y
-    return process_images(p).images[0]
-
-images = [draw_xy_grid(
-    xs = steps,
-    ys = cfg_scales,
-    x_label = lambda x: f'Steps = {x}',
-    y_label = lambda y: f'CFG = {y}',
-    cell = cell
-)]
-
-save_image(images[0], 'test/gnomeplot', 'gnome5')
-display(images)
-```
-
-![](images/scripting.jpg)
-
-A more simple script that would just process the image and output it normally:
-
-```python
-processed = process_images(p)
-
-print("Seed was: " + str(processed.seed))
-
-display(processed.images, processed.seed, processed.info)
-```
-
-### 4GB videocard support
-Optimizations for GPUs with low VRAM. This should make it possible to generate 512x512 images on videocards with 4GB memory.
-
-`--lowvram` is a reimplementation of optimization idea from by [basujindal](https://github.com/basujindal/stable-diffusion).
-Model is separated into modules, and only one module is kept in GPU memory; when another module needs to run, the previous
-is removed from GPU memory. The nature of this optimization makes the processing run slower -- about 10 times slower
-compared to normal operation on my RTX 3090.
-
-`--medvram` is another optimization that should reduce VRAM usage significantly by not processing conditional and
-unconditional denoising in a same batch.
-
-This implementation of optimization does not require any modification to original Stable Diffusion code.
-
-### Inpainting
-In img2img tab, draw a mask over a part of image, and that part will be in-painted.
-
-![](images/inpainting.png)
\ No newline at end of file
diff --git a/images/GFPGAN.png b/images/GFPGAN.png
deleted file mode 100644
index bd42c011..00000000
Binary files a/images/GFPGAN.png and /dev/null differ
diff --git a/images/attention-3.jpg b/images/attention-3.jpg
deleted file mode 100644
index 7c7ef0d3..00000000
Binary files a/images/attention-3.jpg and /dev/null differ
diff --git a/images/inpainting.png b/images/inpainting.png
deleted file mode 100644
index 93f71d2b..00000000
Binary files a/images/inpainting.png and /dev/null differ
diff --git a/images/inversion.png b/images/inversion.png
deleted file mode 100644
index 4105de5e..00000000
Binary files a/images/inversion.png and /dev/null differ
diff --git a/images/kopipe.png b/images/kopipe.png
deleted file mode 100644
index 105ddd6a..00000000
Binary files a/images/kopipe.png and /dev/null differ
diff --git a/images/loopback.jpg b/images/loopback.jpg
deleted file mode 100644
index 39602ebe..00000000
Binary files a/images/loopback.jpg and /dev/null differ
diff --git a/images/pnginfo.png b/images/pnginfo.png
deleted file mode 100644
index b6e0f6d1..00000000
Binary files a/images/pnginfo.png and /dev/null differ
diff --git a/images/prompt-matrix.png b/images/prompt-matrix.png
deleted file mode 100644
index 99791330..00000000
Binary files a/images/prompt-matrix.png and /dev/null differ
diff --git a/images/prompt_matrix.jpg b/images/prompt_matrix.jpg
deleted file mode 100644
index 570c8c0e..00000000
Binary files a/images/prompt_matrix.jpg and /dev/null differ
diff --git a/images/resizing.jpg b/images/resizing.jpg
deleted file mode 100644
index 6ac344c0..00000000
Binary files a/images/resizing.jpg and /dev/null differ
diff --git a/images/sampling.png b/images/sampling.png
deleted file mode 100644
index 7d62177d..00000000
Binary files a/images/sampling.png and /dev/null differ
diff --git a/images/scripting.jpg b/images/scripting.jpg
deleted file mode 100644
index 5830ea6d..00000000
Binary files
a/images/scripting.jpg and /dev/null differ
diff --git a/images/sd-upscale.jpg b/images/sd-upscale.jpg
deleted file mode 100644
index f7b4ad9e..00000000
Binary files a/images/sd-upscale.jpg and /dev/null differ
diff --git a/scripts/custom_code.py b/scripts/custom_code.py
index b359050a..5694f2dd 100644
--- a/scripts/custom_code.py
+++ b/scripts/custom_code.py
@@ -9,7 +9,8 @@ class Script(scripts.Script):
     def title(self):
         return "Custom code"
 
-    def enabled(self):
+
+    def show(self, is_img2img):
         return cmd_opts.allow_code
 
     def ui(self, is_img2img):
@@ -18,8 +19,7 @@ class Script(scripts.Script):
         return [code]
 
     def run(self, p, code):
-        if not cmd_opts.allow_code:
-            return
+        assert cmd_opts.allow_code, '--allow-code option must be enabled'
 
         display_result_data = [[], -1, ""]