Sketch → Semi-Realism Digital Art LoRA

LoRA for FireRed-Image-Edit-1.1 (Qwen-Image DiT architecture) that converts line sketches into semi-realism digital art in a specific stylistic register.

Style trigger

Include the trigger token `t01nspcstyle` in every prompt. Example:

turn this sketch into t01nspcstyle digital art, <scene description>
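A minimal sketch of building prompts in this format. The helper name `build_prompt` is hypothetical and not part of any library; only the trigger token and the sentence template come from this card.

```python
TRIGGER = "t01nspcstyle"  # style trigger token from this model card


def build_prompt(scene: str) -> str:
    """Return an edit instruction in the caption format used for training."""
    scene = scene.strip()
    if not scene:
        raise ValueError("scene description must not be empty")
    return f"turn this sketch into {TRIGGER} digital art, {scene}"


print(build_prompt("a fox sitting under a lantern at night"))
```

Keeping the wording identical to the training captions is likely the safest choice, since the LoRA only ever saw this one instruction template.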

Base model

Transformer / DiT: FireRedTeam/FireRed-Image-Edit-1.1
Text encoder / VAE: Qwen/Qwen-Image
Processor: Qwen/Qwen-Image-Edit (processor/)
Trainer: DiffSynth-Studio (v2.0.9)

Dataset

The dataset contains 257 sketch → digital art pairs. Each caption has the form `turn this sketch into t01nspcstyle digital art, <scene description>`.

Data columns:

| Column | Meaning |
|---|---|
| `image` | Target (digital art) |
| `edit_image` | Input (sketch) |
| `prompt` | Instruction |
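As an illustration of these columns, the sketch below writes one row of a metadata file with Python's `csv` module. The file paths, the scene text, and the choice of CSV as the container are assumptions made for the example; the card only specifies the three column names, so check the DiffSynth-Studio documentation for the exact metadata format the trainer expects.

```python
import csv
import io

# Hypothetical file names; only the column names come from the table above.
rows = [
    {
        "image": "art/0001.png",
        "edit_image": "sketch/0001.png",
        "prompt": "turn this sketch into t01nspcstyle digital art, "
                  "a knight resting by a campfire",
    },
]


def write_metadata(rows, fh):
    """Write sketch/target pairs with the three columns used by this dataset."""
    writer = csv.DictWriter(fh, fieldnames=["image", "edit_image", "prompt"])
    writer.writeheader()
    writer.writerows(rows)


buffer = io.StringIO()  # stands in for an open file
write_metadata(rows, buffer)
print(buffer.getvalue())
```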

Training parameters

| Parameter | Value | Notes |
|---|---|---|
| `lora_rank` | 32 | |
| `lora_base_model` | `dit` | LoRA adapts the transformer only |
| `lora_target_modules` | q/k/v proj, mlp.net.2, mod.1 | (full list in `training_config.json`) |
| `learning_rate` | 1e-4 | AdamW default betas |
| `num_epochs` | 4 | |
| `dataset_repeat` | 25 | → 6 425 steps per epoch |
| Total steps | ~25 700 | |
| `max_pixels` | 1 478 656 | 1216², divisible by VAE stride |
| `mixed_precision` | bf16 | |
| `use_gradient_checkpointing` | true | required on 80 GB VRAM |
| `save_steps` | 2000 | intermediate checkpoints |
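The step counts and the pixel budget in the table follow from simple arithmetic, reproduced below as a check. Two values are assumptions that the card does not state: a batch size of 1 (implied by 257 × 25 = 6 425 steps per epoch) and a stride of 16 used for the divisibility check.

```python
import math

num_pairs = 257        # dataset size from the Dataset section
dataset_repeat = 25
num_epochs = 4
batch_size = 1         # assumption: one sample per optimizer step
assumed_stride = 16    # assumption: the card does not give the stride value

steps_per_epoch = num_pairs * dataset_repeat // batch_size
total_steps = steps_per_epoch * num_epochs
side = math.isqrt(1_478_656)  # side length of a square image at max_pixels

print(steps_per_epoch)             # 6425
print(total_steps)                 # 25700
print(side)                        # 1216
print(side % assumed_stride == 0)  # True
```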

Hardware

GPU: NVIDIA H100 80GB HBM3 (SXM)
Throughput: ~3.6 s/it
Per-epoch time: ~6h 25m
Peak VRAM: ~67 GB
Peak power: ~670 W
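The per-epoch time is consistent with the measured throughput, as the short calculation below shows. The four-epoch total is only an estimate derived from these figures, not a reported measurement, and `format_duration` is a helper written for this example.

```python
seconds_per_step = 3.6   # throughput reported above
steps_per_epoch = 6425   # 257 pairs x 25 repeats
num_epochs = 4


def format_duration(seconds: float) -> str:
    """Format a duration as hours and whole minutes, truncating seconds."""
    total_minutes = round(seconds) // 60
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours}h {minutes:02d}m"


epoch_seconds = seconds_per_step * steps_per_epoch
print(format_duration(epoch_seconds))               # 6h 25m
print(format_duration(epoch_seconds * num_epochs))  # 25h 42m
```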

Checkpoints

Intermediate checkpoints are written every 2000 steps (`save_steps`). Prefer the earliest checkpoint whose outputs look good: later steps risk overfitting to the style, because 257 image pairs is a small dataset for a rank-32 LoRA.
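For orientation, the sketch below lists the steps at which intermediate checkpoints fall, given `save_steps` and the approximate total of 25 700 steps. Whether the trainer also writes a final or per-epoch checkpoint is not stated in this card, so only the regular interval is enumerated.

```python
save_steps = 2000
total_steps = 25_700   # 4 epochs x 6425 steps

# Steps at which a regular intermediate checkpoint is written.
checkpoint_steps = list(range(save_steps, total_steps + 1, save_steps))

print(len(checkpoint_steps))                      # 12
print(checkpoint_steps[0], checkpoint_steps[-1])  # 2000 24000
```

Evaluating these in ascending order on a few held-out sketches is one simple way to find the earliest acceptable checkpoint.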

Inference (DiffSynth-Studio)

import torch
from PIL import Image
from diffsynth.pipelines.qwen_image import QwenImagePipeline, ModelConfig

# Base components: FireRed DiT plus the Qwen-Image text encoder and VAE.
pipe = QwenImagePipeline.from_pretrained(
    torch_dtype=torch.bfloat16,  # pass a torch dtype, not the string "bfloat16"
    device="cuda",
    model_configs=[
        ModelConfig(model_id="FireRedTeam/FireRed-Image-Edit-1.1",
                    origin_file_pattern="transformer/diffusion_pytorch_model*.safetensors"),
        ModelConfig(model_id="Qwen/Qwen-Image",
                    origin_file_pattern="text_encoder/model*.safetensors"),
        ModelConfig(model_id="Qwen/Qwen-Image",
                    origin_file_pattern="vae/diffusion_pytorch_model.safetensors"),
    ],
    tokenizer_config=ModelConfig(model_id="Qwen/Qwen-Image", origin_file_pattern="tokenizer/"),
    processor_config=ModelConfig(model_id="Qwen/Qwen-Image-Edit", origin_file_pattern="processor/"),
)
# Attach the LoRA to the DiT; replace step-XXXX with the chosen checkpoint.
pipe.load_lora(pipe.dit, "step-XXXX.safetensors", alpha=1.0)

# The input sketch is passed as a one-element list.
sketch = Image.open("my_sketch.png")
image = pipe(
    prompt="turn this sketch into t01nspcstyle digital art, <scene>",
    edit_image=[sketch],
    num_inference_steps=30,
    cfg_scale=4.0,
)
image.save("out.png")

IMPORTANT: FireRed expects `edit_image` as a list, even for a single image. Pass `edit_image=[img]`, not `edit_image=img`.
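One way to guard against passing a bare image is to normalize the argument before calling the pipeline. The helper `as_image_list` below is hypothetical and not part of DiffSynth-Studio; a string stands in for a `PIL.Image.Image` object so that the example runs without extra dependencies.

```python
from typing import Any, List


def as_image_list(images: Any) -> List[Any]:
    """Wrap a single image in a list; return lists and tuples as lists."""
    if isinstance(images, (list, tuple)):
        return list(images)
    return [images]


sketch = "my_sketch.png"  # stand-in for a PIL.Image.Image object
print(as_image_list(sketch))    # ['my_sketch.png']
print(as_image_list([sketch]))  # ['my_sketch.png']
```

With this helper, a call such as `pipe(..., edit_image=as_image_list(sketch))` accepts either form.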

License

Apache-2.0, inherited from both FireRed-Image-Edit-1.1 and Qwen-Image.

