Sketch → Semi-Realism Digital Art LoRA

LoRA for FireRed-Image-Edit-1.1 (Qwen-Image DiT architecture) that converts line sketches into semi-realism digital art in a specific stylistic register.

Style trigger

Use t01nspcstyle in prompts. Example:

turn this sketch into t01nspcstyle digital art, <scene description>
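
A small helper can keep the trigger phrase consistent across prompts. This is only a minimal sketch: `PROMPT_PREFIX` and `build_prompt` are illustrative names, not part of the model or of DiffSynth-Studio, and the scene text is a placeholder.

```python
# Hypothetical helper: prepend the fixed instruction and trigger word
# to a free-form scene description.
PROMPT_PREFIX = "turn this sketch into t01nspcstyle digital art"


def build_prompt(scene_description: str) -> str:
    """Return a prompt in the same form as the training captions."""
    scene = scene_description.strip()
    if not scene:
        raise ValueError("scene_description must not be empty")
    return f"{PROMPT_PREFIX}, {scene}"


# Placeholder scene description, for illustration only.
print(build_prompt("a fox sitting under a lantern at night"))
```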

Base model

FireRedTeam/FireRed-Image-Edit-1.1 (Qwen-Image DiT architecture)

Dataset

257 pairs of sketch → digital art. Each caption has the form turn this sketch into t01nspcstyle digital art, <scene description>.

Data columns:

| Column | Meaning |
| --- | --- |
| image | Target (digital art) |
| edit_image | Input (sketch) |
| prompt | Instruction |
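
To make the column layout concrete, here is a rough sketch of a per-row sanity check. The column names and trigger word come from this card; `check_row` and the file names in the sample row are hypothetical and say nothing about how the dataset is actually stored.

```python
REQUIRED_COLUMNS = ("image", "edit_image", "prompt")
TRIGGER = "t01nspcstyle"


def check_row(row: dict) -> list[str]:
    """Return a list of problems found in one dataset row (empty if none)."""
    problems = [f"missing column: {c}" for c in REQUIRED_COLUMNS if not row.get(c)]
    if row.get("prompt") and TRIGGER not in row["prompt"]:
        problems.append("prompt does not contain the trigger word")
    return problems


# Hypothetical row; the file names are placeholders.
sample_row = {
    "image": "art/0001.png",          # target (digital art)
    "edit_image": "sketch/0001.png",  # input (sketch)
    "prompt": "turn this sketch into t01nspcstyle digital art, a castle on a hill",
}
print(check_row(sample_row))
```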

Training parameters

| Parameter | Value | Notes |
| --- | --- | --- |
| lora_rank | 32 | |
| lora_base_model | dit | LoRA adapts the transformer only |
| lora_target_modules | q/k/v proj, mlp.net.2, mod.1 | full list in training_config.json |
| learning_rate | 1e-4 | AdamW, default betas |
| num_epochs | 4 | |
| dataset_repeat | 25 | → 6,425 steps per epoch |
| Total steps | ~25,700 | |
| max_pixels | 1,478,656 | 1216², divisible by VAE stride |
| mixed_precision | bf16 | |
| use_gradient_checkpointing | true | required on 80 GB VRAM |
| save_steps | 2000 | intermediate checkpoints |
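
The step counts and the pixel budget follow from simple arithmetic, sketched below. The batch size of 1 is an assumption: the card does not state it, but it is the value consistent with the listed step counts.

```python
num_pairs = 257
dataset_repeat = 25
num_epochs = 4
batch_size = 1  # assumption: not stated in the card, but consistent with its step counts

steps_per_epoch = num_pairs * dataset_repeat // batch_size
total_steps = steps_per_epoch * num_epochs
max_pixels = 1216 ** 2

print(steps_per_epoch)  # 6425
print(total_steps)      # 25700
print(max_pixels)       # 1478656
```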

Hardware

  • GPU: NVIDIA H100 80GB HBM3 (SXM)
  • Throughput: ~3.6 s/it
  • Per-epoch time: ~6h 25m
  • Peak VRAM: ~67 GB
  • Peak power: ~670 W
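
The per-epoch time can be cross-checked against the throughput. The sketch below uses the 3.6 s/it figure above and 6,425 steps per epoch over 4 epochs; the total is only an estimate that ignores checkpoint saving and other overhead, and `format_hm` is an illustrative helper.

```python
seconds_per_step = 3.6
steps_per_epoch = 6425
num_epochs = 4


def format_hm(seconds: float) -> str:
    """Format a duration as hours and whole minutes (seconds are truncated)."""
    minutes = int(seconds // 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours}h {minutes:02d}m"


epoch_time = format_hm(seconds_per_step * steps_per_epoch)
total_time = format_hm(seconds_per_step * steps_per_epoch * num_epochs)
print(epoch_time)  # 6h 25m
print(total_time)  # 25h 42m (estimate, excludes overhead)
```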

Checkpoints

Intermediate checkpoints are written every 2000 steps. Use the earliest one that looks good; later steps risk style overfit (257 images is small for rank 32).
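
To see which checkpoints exist and which epoch each one falls in, the schedule can be enumerated as below. This is a sketch using save_steps = 2000, ~25,700 total steps, and 6,425 steps per epoch; `checkpoint_steps` and `epoch_of` are illustrative helpers, and whether the trainer also writes a final checkpoint at the last step is not stated here.

```python
def checkpoint_steps(total_steps: int, save_steps: int) -> list[int]:
    """Steps at which an intermediate checkpoint is written."""
    return list(range(save_steps, total_steps + 1, save_steps))


def epoch_of(step: int, steps_per_epoch: int = 6425) -> int:
    """1-based index of the epoch that contains the given step."""
    return (step - 1) // steps_per_epoch + 1


steps = checkpoint_steps(total_steps=25_700, save_steps=2_000)
for step in steps:
    print(f"step {step:>6}: epoch {epoch_of(step)}")
```

Starting the visual comparison from the lowest step and moving upward matches the advice to prefer the earliest checkpoint that looks good.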

Inference (DiffSynth-Studio)

import torch
from PIL import Image
from diffsynth.pipelines.qwen_image import QwenImagePipeline, ModelConfig

pipe = QwenImagePipeline.from_pretrained(
    torch_dtype=torch.bfloat16,  # pass a torch.dtype, not the string "bfloat16"
    device="cuda",
    model_configs=[
        ModelConfig(model_id="FireRedTeam/FireRed-Image-Edit-1.1",
                    origin_file_pattern="transformer/diffusion_pytorch_model*.safetensors"),
        ModelConfig(model_id="Qwen/Qwen-Image",
                    origin_file_pattern="text_encoder/model*.safetensors"),
        ModelConfig(model_id="Qwen/Qwen-Image",
                    origin_file_pattern="vae/diffusion_pytorch_model.safetensors"),
    ],
    tokenizer_config=ModelConfig(model_id="Qwen/Qwen-Image", origin_file_pattern="tokenizer/"),
    processor_config=ModelConfig(model_id="Qwen/Qwen-Image-Edit", origin_file_pattern="processor/"),
)
pipe.load_lora(pipe.dit, "step-XXXX.safetensors", alpha=1.0)

sketch = Image.open("my_sketch.png")
image = pipe(
    prompt="turn this sketch into t01nspcstyle digital art, <scene>",
    edit_image=[sketch],
    num_inference_steps=30,
    cfg_scale=4.0,
)
image.save("out.png")

IMPORTANT: FireRed expects edit_image as a list, even with a single image (edit_image=[img]), not edit_image=img.
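
A tiny guard can prevent passing a bare image by mistake. This is a minimal sketch; `as_image_list` is a hypothetical helper, not part of DiffSynth-Studio, and it works on any object, so it is shown here with placeholder strings instead of real images.

```python
def as_image_list(images):
    """Wrap a single image in a list; return lists and tuples as a new list."""
    if images is None:
        raise ValueError("at least one edit image is required")
    if isinstance(images, (list, tuple)):
        return list(images)
    return [images]


# Placeholder values stand in for PIL images.
print(as_image_list("sketch"))          # ['sketch']
print(as_image_list(["a", "b"]))        # ['a', 'b']
```

With such a helper, the call in the example above could be written as `edit_image=as_image_list(sketch)`.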

License

Apache-2.0, inherited from both FireRed-Image-Edit-1.1 and Qwen-Image.
