Ghibli Fine-tuned Stable Diffusion 2.1 (INT4 Quantization)

Version 1.0.0 | License: MIT | Python 3.8+ | PyTorch 2.0+ | Diffusers 0.20+ | OpenVINO 2023.0+

Quantizing from the Base Model

Install Dependencies

pip install -q "optimum-intel[openvino,diffusers]" torch transformers diffusers openvino nncf optimum-quanto

Import Libraries

import os

import torch
from diffusers import StableDiffusionPipeline
from nncf import CompressWeightsMode
from optimum.intel import (
    OVConfig,
    OVQuantizer,
    OVStableDiffusionPipeline,
    OVWeightQuantizationConfig,
)

Load Base Model

model_id = "danhtran2mind/ghibli-fine-tuned-sd-2.1"
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if torch.cuda.is_available() else torch.float32

# Load the base PyTorch pipeline; its scheduler and tokenizer are saved
# alongside the quantized model later
pipeline = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=dtype)

Export to OpenVINO format without quantization

# Export to OpenVINO format without quantization
ov_pipeline = OVStableDiffusionPipeline.from_pretrained(
    model_id,
    export=True,
    compile=False,
    load_in_8bit=False,  # Explicitly disable 8-bit quantization
    load_in_4bit=False,  # Explicitly disable 4-bit quantization
    torch_dtype=dtype
)

Define INT4 quantization configuration

# Define INT4 quantization configuration
ov_weight_config_int4 = OVWeightQuantizationConfig(
    weight_only=True,
    mode=CompressWeightsMode.INT4_ASYM,  # Use enum for asymmetric INT4
    group_size=64,
    ratio=0.9  # 90% INT4, 10% INT8
)
ov_config_int4 = OVConfig(quantization_config=ov_weight_config_int4)
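For intuition, here is a minimal NumPy sketch (an illustration, not the NNCF implementation) of what asymmetric INT4 quantization does to a single group of 64 weights, matching the `group_size=64` setting above: each group gets its own scale and zero-point, and values are rounded into the 4-bit range 0..15.

```python
import numpy as np

def quantize_group_int4_asym(w):
    # Per-group asymmetric scheme: map [min, max] onto the 4-bit range 0..15
    lo, hi = w.min(), w.max()
    scale = (hi - lo) / 15.0
    zero_point = np.round(-lo / scale)
    q = np.clip(np.round(w / scale) + zero_point, 0, 15)
    return q.astype(np.uint8), scale, zero_point

def dequantize(q, scale, zero_point):
    return (q.astype(np.float32) - zero_point) * scale

rng = np.random.default_rng(0)
group = rng.normal(size=64).astype(np.float32)  # one group of 64 weights
q, s, zp = quantize_group_int4_asym(group)
recon = dequantize(q, s, zp)
max_err = np.abs(group - recon).max()  # bounded by roughly one quantization step
```

Each 4-bit group carries a small per-group overhead (the scale and zero-point), which is why larger `group_size` values compress better but quantize more coarsely.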

Quantize and Save the Model

# Create Quantization Directory
save_dir_int4 = "ghibli_sd_int4"
os.makedirs(save_dir_int4, exist_ok=True)

quantizer = OVQuantizer.from_pretrained(ov_pipeline, task="stable-diffusion")

# Quantize the model
quantizer.quantize(ov_config=ov_config_int4, save_directory=save_dir_int4)

# Save scheduler and tokenizer
pipeline.scheduler.save_pretrained(save_dir_int4)
pipeline.tokenizer.save_pretrained(save_dir_int4)
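As a rough back-of-envelope check on what `ratio=0.9` buys (an estimate that ignores quantization scales, zero-points, and non-weight tensors):

```python
# Effective bits per weight when 90% of weight layers are INT4 and 10% INT8
effective_bits = 0.9 * 4 + 0.1 * 8   # 4.4 bits per weight
fp16_bits = 16

# Approximate weights-only compression relative to the FP16 export
compression = fp16_bits / effective_bits
print(f"~{compression:.1f}x smaller weights")  # roughly 3.6x
```

Raising `ratio` toward 1.0 shrinks the model further at some cost in fidelity, since fewer layers are kept at the more accurate INT8 precision.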

Usage

Install Dependencies

pip install -q "optimum-intel[openvino,diffusers]" openvino

Import Libraries

from optimum.intel import OVStableDiffusionPipeline

pipe = OVStableDiffusionPipeline.from_pretrained("danhtran2mind/ghibli-fine-tuned-sd-2.1-int4")
pipe.to("CPU")  # OpenVINO device name ("CPU", "GPU", ...), not a torch/CUDA device

prompt = "a serene Ghibli-style landscape with rolling green hills"
image = pipe(prompt, num_inference_steps=25).images[0]
image.save("ghibli_int4.png")