Qwen-Image LoRAs
The idea of the "warmup stages" is to roughly pretrain the LoRA weights on a broad dataset so they can then be further trained on more refined data, hopefully giving the final LoRA a boost in overall knowledge.
This stage trains at 256p, the resolution Qwen-Image was originally pretrained at.
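For reference, the stage-1 dataset config that the log below loads would look roughly like this sketch, reconstructed from the values the log reports (256×256 resolution, batch size 16, bucketing with no upscaling). The actual `SynthSat-v1.0-warmup-stage1.toml` isn't reproduced here, so treat the exact layout as an assumption.

```toml
# Approximate reconstruction of SynthSat-v1.0-warmup-stage1.toml,
# based on the values reported in the training log; not the actual file.

[general]
resolution = [256, 256]      # 256p, matching Qwen-Image's original pretraining resolution
batch_size = 16
caption_extension = ".txt"
enable_bucket = true         # aspect-ratio bucketing around the 256x256 area budget
bucket_no_upscale = true     # never upscale images smaller than their bucket

[[datasets]]
image_directory = "/media/xzuyn/NVMe/Datasets/Images/000_SynthSatWarmup"
cache_directory = "/media/xzuyn/NVMe/LClones/musubi-tuner/dataset_cache/SynthSat-v1.0-warmup-stage1-256"
num_repeats = 1
```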
INFO:musubi_tuner.hv_train_network:Load dataset config from /media/xzuyn/Toshiba1/musubi-stuff/dataset_configs/SynthSat-v1.0-warmup-stage1.toml
INFO:musubi_tuner.dataset.image_video_dataset:glob images in /media/xzuyn/NVMe/Datasets/Images/000_SynthSatWarmup
INFO:musubi_tuner.dataset.image_video_dataset:found 26915 images
INFO:musubi_tuner.dataset.config_utils:[Dataset 0]
is_image_dataset: True
resolution: (256, 256)
batch_size: 16
num_repeats: 1
caption_extension: ".txt"
enable_bucket: True
bucket_no_upscale: True
cache_directory: "/media/xzuyn/NVMe/LClones/musubi-tuner/dataset_cache/SynthSat-v1.0-warmup-stage1-256"
debug_dataset: False
image_directory: "/media/xzuyn/NVMe/Datasets/Images/000_SynthSatWarmup"
image_jsonl_file: "None"
fp_latent_window_size: 9
fp_1f_clean_indices: None
fp_1f_target_index: None
fp_1f_no_post: False
flux_kontext_no_resize_control: False
qwen_image_edit_no_resize_control: False
qwen_image_edit_control_resolution: None
INFO:musubi_tuner.dataset.image_video_dataset:bucket: (128, 512), count: 2
INFO:musubi_tuner.dataset.image_video_dataset:bucket: (144, 448), count: 1
INFO:musubi_tuner.dataset.image_video_dataset:bucket: (160, 400), count: 22
INFO:musubi_tuner.dataset.image_video_dataset:bucket: (176, 368), count: 180
INFO:musubi_tuner.dataset.image_video_dataset:bucket: (192, 336), count: 1197
INFO:musubi_tuner.dataset.image_video_dataset:bucket: (208, 304), count: 6783
INFO:musubi_tuner.dataset.image_video_dataset:bucket: (224, 288), count: 7671
INFO:musubi_tuner.dataset.image_video_dataset:bucket: (240, 272), count: 554
INFO:musubi_tuner.dataset.image_video_dataset:bucket: (256, 256), count: 1257
INFO:musubi_tuner.dataset.image_video_dataset:bucket: (272, 240), count: 276
INFO:musubi_tuner.dataset.image_video_dataset:bucket: (288, 224), count: 2652
INFO:musubi_tuner.dataset.image_video_dataset:bucket: (304, 208), count: 5412
INFO:musubi_tuner.dataset.image_video_dataset:bucket: (336, 192), count: 727
INFO:musubi_tuner.dataset.image_video_dataset:bucket: (368, 176), count: 135
INFO:musubi_tuner.dataset.image_video_dataset:bucket: (400, 160), count: 29
INFO:musubi_tuner.dataset.image_video_dataset:bucket: (448, 144), count: 13
INFO:musubi_tuner.dataset.image_video_dataset:bucket: (512, 128), count: 4
INFO:musubi_tuner.dataset.image_video_dataset:total batches: 1691
INFO:musubi_tuner.hv_train_network:preparing accelerator
accelerator device: cuda
INFO:musubi_tuner.hv_train_network:DiT precision: torch.bfloat16, weight precision: None
INFO:musubi_tuner.hv_train_network:Loading DiT model from /media/xzuyn/NVMe/LClones/musubi-tuner/source_models/qwen_image_bf16.safetensors
INFO:musubi_tuner.qwen_image.qwen_image_model:Creating QwenImageTransformer2DModel
INFO:musubi_tuner.qwen_image.qwen_image_model:Loading DiT model from /media/xzuyn/NVMe/LClones/musubi-tuner/source_models/qwen_image_bf16.safetensors, device=cpu
INFO:musubi_tuner.utils.lora_utils:Loading model files: ['/media/xzuyn/NVMe/LClones/musubi-tuner/source_models/qwen_image_bf16.safetensors']
INFO:musubi_tuner.utils.lora_utils:Loading state dict with FP8 optimization. Dtype of weight: None, hook enabled: False
Loading qwen_image_bf16.safetensors: 100%|██████████████████████████████| 1933/1933 [01:22<00:00, 23.42key/s]
INFO:musubi_tuner.modules.fp8_optimization_utils:Number of optimized Linear layers: 840
INFO:musubi_tuner.modules.fp8_optimization_utils:Number of monkey-patched Linear layers: 840
INFO:musubi_tuner.qwen_image.qwen_image_model:Loaded DiT model from /media/xzuyn/NVMe/LClones/musubi-tuner/source_models/qwen_image_bf16.safetensors, info=<All keys matched successfully>
INFO:musubi_tuner.hv_train_network:enable swap 10 blocks to CPU from device: cuda
QwenModel: Block swap enabled. Swapping 10 blocks out of 60 blocks. Supports backward: True
import network module: networks.lora_qwen_image
INFO:musubi_tuner.networks.lora:create LoRA network. base dim (rank): 16, alpha: 4.0
INFO:musubi_tuner.networks.lora:neuron dropout: p=0.125, rank dropout: p=0.0, module dropout: p=0.0
INFO:musubi_tuner.networks.lora:create LoRA for U-Net/DiT: 840 modules.
INFO:musubi_tuner.networks.lora:enable LoRA for U-Net: 840 modules
QwenModel: Gradient checkpointing enabled. Activation CPU offloading: True
prepare optimizer, data loader etc.
INFO:musubi_tuner.hv_train_network:use came_pytorch.CAME | {'weight_decay': 0.01, 'enable_8bit': True, 'enable_cautious': True, 'enable_stochastic_rounding': True}
==== CAME Modifications ====
- Stochastic Rounding enabled.
- Cautious Masking enabled.
- 8-bit enabled: block_size=2048, min_8bit_size=16384.
==== CAME Modifications ====
override steps. steps for 1 epochs is / 指定エポックまでのステップ数: 1691
INFO:musubi_tuner.hv_train_network:preparing fused backward pass stuff
running training / 学習開始
num train items / 学習画像、動画数: 26915
num batches per epoch / 1epochのバッチ数: 1691
num epochs / epoch数: 1
batch size per device / バッチサイズ: 16
gradient accumulation steps / 勾配を合計するステップ数 = 1
total optimization steps / 学習ステップ数: 1691
INFO:musubi_tuner.hv_train_network:set DiT model name for metadata: /media/xzuyn/NVMe/LClones/musubi-tuner/source_models/qwen_image_bf16.safetensors
INFO:musubi_tuner.hv_train_network:set VAE model name for metadata: /media/xzuyn/NVMe/LClones/musubi-tuner/source_models/vae_diffusion_pytorch_model.safetensors
Base model: Qwen/Qwen-Image