## Usage

1. Use the Diffusers backend: `Execution & Models` -> `Execution backend`
2. Go into `Compute Settings`
3. Enable the `Compress Model weights with NNCF` option
4. Restart the WebUI if this is your first time using NNCF; otherwise, just reload the model.
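Under the hood, enabling the option is roughly equivalent to running NNCF's data-free INT8 weight compression over the loaded pipeline. Below is a minimal sketch, assuming a recent `nncf` build with PyTorch weight-compression support; the checkpoint id and the size-measuring helper are illustrative and not SD.Next's actual code path.

```python
# Minimal sketch: data-free INT8 weight compression of a diffusers UNet with nncf.
# This approximates what the WebUI option does; the checkpoint id and the helper
# below are illustrative assumptions, not SD.Next's actual code.
import torch
import nncf  # pip install nncf
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",  # assumed checkpoint
    torch_dtype=torch.float16,
)

def module_size_gb(module: torch.nn.Module) -> float:
    # Count parameter and buffer storage so INT8 weights are included after compression.
    total = sum(p.numel() * p.element_size() for p in module.parameters())
    total += sum(b.numel() * b.element_size() for b in module.buffers())
    return total / 1e9

before = module_size_gb(pipe.unet)
pipe.unet = nncf.compress_weights(pipe.unet)  # INT8 weight-only compression by default
after = module_size_gb(pipe.unet)
print(f"UNet storage: {before:.2f} GB -> {after:.2f} GB")
```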
### Features

* Uses INT8, which roughly halves the model size (saves about 3.4 GB of VRAM with SDXL)
* Works with the Diffusers backend
### Disadvantages

* Works like autocast: weights are stored in INT8, but the GPU still runs the model in 16 bit, so inference is slower
* Uses INT8, which can break ControlNet
* Using a LoRA triggers a model reload
* Not implemented in the Original backend
* Fused projections are not compatible with NNCF
## Options

These results compare NNCF 8-bit to 16-bit. A sketch of compressing each component manually follows this list.

- Model:
  Compresses the UNet or Transformer part of the model.
  This is where most of the memory savings happen for Stable Diffusion.
  SDXL: ~2500 MB memory savings.
  SD 1.5: ~750 MB memory savings.
  PixArt-XL-2: ~600 MB memory savings.
- Text Encoder:
  Compresses the Text Encoder parts of the model.
  This is where most of the memory savings happen for PixArt.
  PixArt-XL-2: ~4750 MB memory savings.
  SDXL: ~750 MB memory savings.
  SD 1.5: ~120 MB memory savings.
- VAE:
  Compresses the VAE part of the model.
  Memory savings from compressing the VAE are fairly small.
  SD 1.5 / SDXL / PixArt-XL-2: ~75 MB memory savings.
- 4-Bit Compression and Quantization:
  4-bit compression modes and quantization can be used with the OpenVINO backend.
  For more info: https://github.com/vladmandic/automatic/wiki/OpenVINO#quantization
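The Model / Text Encoder / VAE toggles above map roughly onto compressing the corresponding sub-modules of a diffusers pipeline. A hedged sketch of doing the same thing manually (attribute names follow standard diffusers pipelines; the WebUI applies this automatically based on which options you enable):

```python
# Sketch: per-component INT8 weight compression, mirroring the Model /
# Text Encoder / VAE options above. Attribute names follow standard
# diffusers pipelines; this is illustrative, not SD.Next's exact code.
import torch
import nncf
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",  # assumed checkpoint
    torch_dtype=torch.float16,
)

# "Model" option: UNet (or the transformer for DiT-style models such as PixArt)
pipe.unet = nncf.compress_weights(pipe.unet)

# "Text Encoder" option: SDXL ships two text encoders
pipe.text_encoder = nncf.compress_weights(pipe.text_encoder)
pipe.text_encoder_2 = nncf.compress_weights(pipe.text_encoder_2)

# "VAE" option: smallest savings
pipe.vae = nncf.compress_weights(pipe.vae)
```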
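For the 4-bit path, NNCF exposes INT4 weight-compression modes, but only for OpenVINO models, which is why it requires the OpenVINO backend. A rough sketch on a UNet that has already been exported to OpenVINO IR; the file path, `group_size`, and `ratio` values are assumptions for illustration, not recommended settings.

```python
# Sketch: 4-bit weight-only compression of an OpenVINO model with nncf.
# The IR path and the tuning knobs below are illustrative assumptions.
import nncf
import openvino as ov

core = ov.Core()
unet = core.read_model("unet/openvino_model.xml")  # assumed path to an exported UNet

compressed = nncf.compress_weights(
    unet,
    mode=nncf.CompressWeightsMode.INT4_SYM,  # symmetric 4-bit weights
    group_size=128,                          # group-wise quantization granularity
    ratio=0.8,                               # keep ~20% of weights in INT8 for quality
)
ov.save_model(compressed, "unet/openvino_model_int4.xml")
```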