Wan 2.2 Ovi
Hello Kijai, hello community. I wanted to ask about the new Wan 2.2 Ovi diffusion models that have been uploaded here. How exactly are those to be used? Am I supposed to load them via the Diffusion Model Loader KJ and then just add the special tags into the prompt, and that's it?
It's very much a work in progress; currently it's only testable in the ovi branch of the WanVideoWrapper.
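For reference, switching an existing WanVideoWrapper install over to that branch looks roughly like this (the custom-node path is an assumption; adjust it to your own ComfyUI install):

```shell
# Path is an assumption -- adjust to wherever your custom nodes live
cd ComfyUI/custom_nodes/ComfyUI-WanVideoWrapper
git fetch origin
git checkout ovi   # the work-in-progress Ovi branch
```

Restart ComfyUI afterwards so the node definitions from the branch are picked up.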
I see. I am very curious to know whether this will be manageable with 12GB VRAM. I tested the ComfyUI-Ovi workflow and yeah... that monster requires at least 24GB.
It should be usable, with fp8 and full block swap it used around 6GB VRAM at 704x704 for 121 frames in my testing.
I tried it but got stuck at loading the Wan VAE. I redownloaded your Wan2_2_VAE_bf16.vae and it matches the filename from your example workflow, but it seems the node is expecting a VAE in another format...
What do you mean exactly, is there an error?
Yes, this is the error:
WanVideoVAELoader
Error(s) in loading state_dict for WanVideoVAE:
Missing key(s) in state_dict: "model.encoder.downsamples.0.residual.0.gamma", "model.encoder.downsamples.0.residual.2.weight", "model.encoder.downsamples.0.residual.2.bias", "model.encoder.downsamples.0.residual.3.gamma", "model.encoder.downsamples.0.residual.6.weight", "model.encoder.downsamples.0.residual.6.bias", "model.encoder.downsamples.1.residual.0.gamma", "model.encoder.downsamples.1.residual.2.weight", "model.encoder.downsamples.1.residual.2.bias", "model.encoder.downsamples.1.residual.3.gamma", "model.encoder.downsamples.1.residual.6.weight", "model.encoder.downsamples.1.residual.6.bias", "model.encoder.downsamples.2.resample.1.weight", "model.encoder.downsamples.2.resample.1.bias", "model.encoder.downsamples.3.residual.0.gamma", "model.encoder.downsamples.3.residual.2.weight", "model.encoder.downsamples.3.residual.2.bias", "model.encoder.downsamples.3.residual.3.gamma", "model.encoder.downsamples.3.residual.6.weight", "model.encoder.downsamples.3.residual.6.bias", "model.encoder.downsamples.3.shortcut.weight", "model.encoder.downsamples.3.shortcut.bias", "model.encoder.downsamples.4.residual.0.gamma", "model.encoder.downsamples.4.residual.2.weight", "model.encoder.downsamples.4.residual.2.bias", "model.encoder.downsamples.4.residual.3.gamma", "model.encoder.downsamples.4.residual.6.weight", "model.encoder.downsamples.4.residual.6.bias", "model.encoder.downsamples.5.resample.1.weight", "model.encoder.downsamples.5.resample.1.bias", "model.encoder.downsamples.5.time_conv.weight", "model.encoder.downsamples.5.time_conv.bias", "model.encoder.downsamples.6.residual.0.gamma", "model.encoder.downsamples.6.residual.2.weight", "model.encoder.downsamples.6.residual.2.bias", "model.encoder.downsamples.6.residual.3.gamma", "model.encoder.downsamples.6.residual.6.weight", "model.encoder.downsamples.6.residual.6.bias", "model.encoder.downsamples.6.shortcut.weight", "model.encoder.downsamples.6.shortcut.bias", "model.encoder.downsamples.7.residual.0.gamma", 
"model.encoder.downsamples.7.residual.2.weight", "model.encoder.downsamples.7.residual.2.bias", "model.encoder.downsamples.7.residual.3.gamma", "model.encoder.downsamples.7.residual.6.weight", "model.encoder.downsamples.7.residual.6.bias", "model.encoder.downsamples.8.resample.1.weight", "model.encoder.downsamples.8.resample.1.bias", "model.encoder.downsamples.8.time_conv.weight", "model.encoder.downsamples.8.time_conv.bias", "model.encoder.downsamples.9.residual.0.gamma", "model.encoder.downsamples.9.residual.2.weight", "model.encoder.downsamples.9.residual.2.bias", "model.encoder.downsamples.9.residual.3.gamma", "model.encoder.downsamples.9.residual.6.weight", "model.encoder.downsamples.9.residual.6.bias", "model.encoder.downsamples.10.residual.0.gamma", "model.encoder.downsamples.10.residual.2.weight", "model.encoder.downsamples.10.residual.2.bias", "model.encoder.downsamples.10.residual.3.gamma", "model.encoder.downsamples.10.residual.6.weight", "model.encoder.downsamples.10.residual.6.bias", "model.decoder.upsamples.0.residual.0.gamma", "model.decoder.upsamples.0.residual.2.weight", "model.decoder.upsamples.0.residual.2.bias", "model.decoder.upsamples.0.residual.3.gamma", "model.decoder.upsamples.0.residual.6.weight", "model.decoder.upsamples.0.residual.6.bias", "model.decoder.upsamples.1.residual.0.gamma", "model.decoder.upsamples.1.residual.2.weight", "model.decoder.upsamples.1.residual.2.bias", "model.decoder.upsamples.1.residual.3.gamma", "model.decoder.upsamples.1.residual.6.weight", "model.decoder.upsamples.1.residual.6.bias", "model.decoder.upsamples.2.residual.0.gamma", "model.decoder.upsamples.2.residual.2.weight", "model.decoder.upsamples.2.residual.2.bias", "model.decoder.upsamples.2.residual.3.gamma", "model.decoder.upsamples.2.residual.6.weight", "model.decoder.upsamples.2.residual.6.bias", "model.decoder.upsamples.3.resample.1.weight", "model.decoder.upsamples.3.resample.1.bias", "model.decoder.upsamples.3.time_conv.weight", 
"model.decoder.upsamples.3.time_conv.bias", "model.decoder.upsamples.4.residual.0.gamma", "model.decoder.upsamples.4.residual.2.weight", "model.decoder.upsamples.4.residual.2.bias", "model.decoder.upsamples.4.residual.3.gamma", "model.decoder.upsamples.4.residual.6.weight", "model.decoder.upsamples.4.residual.6.bias", "model.decoder.upsamples.4.shortcut.weight", "model.decoder.upsamples.4.shortcut.bias", "model.decoder.upsamples.5.residual.0.gamma", "model.decoder.upsamples.5.residual.2.weight", "model.decoder.upsamples.5.residual.2.bias", "model.decoder.upsamples.5.residual.3.gamma", "model.decoder.upsamples.5.residual.6.weight", "model.decoder.upsamples.5.residual.6.bias", "model.decoder.upsamples.6.residual.0.gamma", "model.decoder.upsamples.6.residual.2.weight", "model.decoder.upsamples.6.residual.2.bias", "model.decoder.upsamples.6.residual.3.gamma", "model.decoder.upsamples.6.residual.6.weight", "model.decoder.upsamples.6.residual.6.bias", "model.decoder.upsamples.7.resample.1.weight", "model.decoder.upsamples.7.resample.1.bias", "model.decoder.upsamples.7.time_conv.weight", "model.decoder.upsamples.7.time_conv.bias", "model.decoder.upsamples.8.residual.0.gamma", "model.decoder.upsamples.8.residual.2.weight", "model.decoder.upsamples.8.residual.2.bias", "model.decoder.upsamples.8.residual.3.gamma", "model.decoder.upsamples.8.residual.6.weight", "model.decoder.upsamples.8.residual.6.bias", "model.decoder.upsamples.9.residual.0.gamma", "model.decoder.upsamples.9.residual.2.weight", "model.decoder.upsamples.9.residual.2.bias", "model.decoder.upsamples.9.residual.3.gamma", "model.decoder.upsamples.9.residual.6.weight", "model.decoder.upsamples.9.residual.6.bias", "model.decoder.upsamples.10.residual.0.gamma", "model.decoder.upsamples.10.residual.2.weight", "model.decoder.upsamples.10.residual.2.bias", "model.decoder.upsamples.10.residual.3.gamma", "model.decoder.upsamples.10.residual.6.weight", "model.decoder.upsamples.10.residual.6.bias", 
"model.decoder.upsamples.11.resample.1.weight", "model.decoder.upsamples.11.resample.1.bias", "model.decoder.upsamples.12.residual.0.gamma", "model.decoder.upsamples.12.residual.2.weight", "model.decoder.upsamples.12.residual.2.bias", "model.decoder.upsamples.12.residual.3.gamma", "model.decoder.upsamples.12.residual.6.weight", "model.decoder.upsamples.12.residual.6.bias", "model.decoder.upsamples.13.residual.0.gamma", "model.decoder.upsamples.13.residual.2.weight", "model.decoder.upsamples.13.residual.2.bias", "model.decoder.upsamples.13.residual.3.gamma", "model.decoder.upsamples.13.residual.6.weight", "model.decoder.upsamples.13.residual.6.bias", "model.decoder.upsamples.14.residual.0.gamma", "model.decoder.upsamples.14.residual.2.weight", "model.decoder.upsamples.14.residual.2.bias", "model.decoder.upsamples.14.residual.3.gamma", "model.decoder.upsamples.14.residual.6.weight", "model.decoder.upsamples.14.residual.6.bias".
Unexpected key(s) in state_dict: "model.encoder.downsamples.0.downsamples.0.residual.0.gamma", "model.encoder.downsamples.0.downsamples.0.residual.2.bias", "model.encoder.downsamples.0.downsamples.0.residual.2.weight", "model.encoder.downsamples.0.downsamples.0.residual.3.gamma", "model.encoder.downsamples.0.downsamples.0.residual.6.bias", "model.encoder.downsamples.0.downsamples.0.residual.6.weight", "model.encoder.downsamples.0.downsamples.1.residual.0.gamma", "model.encoder.downsamples.0.downsamples.1.residual.2.bias", "model.encoder.downsamples.0.downsamples.1.residual.2.weight", "model.encoder.downsamples.0.downsamples.1.residual.3.gamma", "model.encoder.downsamples.0.downsamples.1.residual.6.bias", "model.encoder.downsamples.0.downsamples.1.residual.6.weight", "model.encoder.downsamples.0.downsamples.2.resample.1.bias", "model.encoder.downsamples.0.downsamples.2.resample.1.weight", "model.encoder.downsamples.1.downsamples.0.residual.0.gamma", "model.encoder.downsamples.1.downsamples.0.residual.2.bias", "model.encoder.downsamples.1.downsamples.0.residual.2.weight", "model.encoder.downsamples.1.downsamples.0.residual.3.gamma", "model.encoder.downsamples.1.downsamples.0.residual.6.bias", "model.encoder.downsamples.1.downsamples.0.residual.6.weight", "model.encoder.downsamples.1.downsamples.0.shortcut.bias", "model.encoder.downsamples.1.downsamples.0.shortcut.weight", "model.encoder.downsamples.1.downsamples.1.residual.0.gamma", "model.encoder.downsamples.1.downsamples.1.residual.2.bias", "model.encoder.downsamples.1.downsamples.1.residual.2.weight", "model.encoder.downsamples.1.downsamples.1.residual.3.gamma", "model.encoder.downsamples.1.downsamples.1.residual.6.bias", "model.encoder.downsamples.1.downsamples.1.residual.6.weight", "model.encoder.downsamples.1.downsamples.2.resample.1.bias", "model.encoder.downsamples.1.downsamples.2.resample.1.weight", "model.encoder.downsamples.1.downsamples.2.time_conv.bias", 
"model.encoder.downsamples.1.downsamples.2.time_conv.weight", "model.encoder.downsamples.2.downsamples.0.residual.0.gamma", "model.encoder.downsamples.2.downsamples.0.residual.2.bias", "model.encoder.downsamples.2.downsamples.0.residual.2.weight", "model.encoder.downsamples.2.downsamples.0.residual.3.gamma", "model.encoder.downsamples.2.downsamples.0.residual.6.bias", "model.encoder.downsamples.2.downsamples.0.residual.6.weight", "model.encoder.downsamples.2.downsamples.0.shortcut.bias", "model.encoder.downsamples.2.downsamples.0.shortcut.weight", "model.encoder.downsamples.2.downsamples.1.residual.0.gamma", "model.encoder.downsamples.2.downsamples.1.residual.2.bias", "model.encoder.downsamples.2.downsamples.1.residual.2.weight", "model.encoder.downsamples.2.downsamples.1.residual.3.gamma", "model.encoder.downsamples.2.downsamples.1.residual.6.bias", "model.encoder.downsamples.2.downsamples.1.residual.6.weight", "model.encoder.downsamples.2.downsamples.2.resample.1.bias", "model.encoder.downsamples.2.downsamples.2.resample.1.weight", "model.encoder.downsamples.2.downsamples.2.time_conv.bias", "model.encoder.downsamples.2.downsamples.2.time_conv.weight", "model.encoder.downsamples.3.downsamples.0.residual.0.gamma", "model.encoder.downsamples.3.downsamples.0.residual.2.bias", "model.encoder.downsamples.3.downsamples.0.residual.2.weight", "model.encoder.downsamples.3.downsamples.0.residual.3.gamma", "model.encoder.downsamples.3.downsamples.0.residual.6.bias", "model.encoder.downsamples.3.downsamples.0.residual.6.weight", "model.encoder.downsamples.3.downsamples.1.residual.0.gamma", "model.encoder.downsamples.3.downsamples.1.residual.2.bias", "model.encoder.downsamples.3.downsamples.1.residual.2.weight", "model.encoder.downsamples.3.downsamples.1.residual.3.gamma", "model.encoder.downsamples.3.downsamples.1.residual.6.bias", "model.encoder.downsamples.3.downsamples.1.residual.6.weight", "model.decoder.upsamples.0.upsamples.0.residual.0.gamma", 
"model.decoder.upsamples.0.upsamples.0.residual.2.bias", "model.decoder.upsamples.0.upsamples.0.residual.2.weight", "model.decoder.upsamples.0.upsamples.0.residual.3.gamma", "model.decoder.upsamples.0.upsamples.0.residual.6.bias", "model.decoder.upsamples.0.upsamples.0.residual.6.weight", "model.decoder.upsamples.0.upsamples.1.residual.0.gamma", "model.decoder.upsamples.0.upsamples.1.residual.2.bias", "model.decoder.upsamples.0.upsamples.1.residual.2.weight", "model.decoder.upsamples.0.upsamples.1.residual.3.gamma", "model.decoder.upsamples.0.upsamples.1.residual.6.bias", "model.decoder.upsamples.0.upsamples.1.residual.6.weight", "model.decoder.upsamples.0.upsamples.2.residual.0.gamma", "model.decoder.upsamples.0.upsamples.2.residual.2.bias", "model.decoder.upsamples.0.upsamples.2.residual.2.weight", "model.decoder.upsamples.0.upsamples.2.residual.3.gamma", "model.decoder.upsamples.0.upsamples.2.residual.6.bias", "model.decoder.upsamples.0.upsamples.2.residual.6.weight", "model.decoder.upsamples.0.upsamples.3.resample.1.bias", "model.decoder.upsamples.0.upsamples.3.resample.1.weight", "model.decoder.upsamples.0.upsamples.3.time_conv.bias", "model.decoder.upsamples.0.upsamples.3.time_conv.weight", "model.decoder.upsamples.1.upsamples.0.residual.0.gamma", "model.decoder.upsamples.1.upsamples.0.residual.2.bias", "model.decoder.upsamples.1.upsamples.0.residual.2.weight", "model.decoder.upsamples.1.upsamples.0.residual.3.gamma", "model.decoder.upsamples.1.upsamples.0.residual.6.bias", "model.decoder.upsamples.1.upsamples.0.residual.6.weight", "model.decoder.upsamples.1.upsamples.1.residual.0.gamma", "model.decoder.upsamples.1.upsamples.1.residual.2.bias", "model.decoder.upsamples.1.upsamples.1.residual.2.weight", "model.decoder.upsamples.1.upsamples.1.residual.3.gamma", "model.decoder.upsamples.1.upsamples.1.residual.6.bias", "model.decoder.upsamples.1.upsamples.1.residual.6.weight", "model.decoder.upsamples.1.upsamples.2.residual.0.gamma", 
"model.decoder.upsamples.1.upsamples.2.residual.2.bias", "model.decoder.upsamples.1.upsamples.2.residual.2.weight", "model.decoder.upsamples.1.upsamples.2.residual.3.gamma", "model.decoder.upsamples.1.upsamples.2.residual.6.bias", "model.decoder.upsamples.1.upsamples.2.residual.6.weight", "model.decoder.upsamples.1.upsamples.3.resample.1.bias", "model.decoder.upsamples.1.upsamples.3.resample.1.weight", "model.decoder.upsamples.1.upsamples.3.time_conv.bias", "model.decoder.upsamples.1.upsamples.3.time_conv.weight", "model.decoder.upsamples.2.upsamples.0.residual.0.gamma", "model.decoder.upsamples.2.upsamples.0.residual.2.bias", "model.decoder.upsamples.2.upsamples.0.residual.2.weight", "model.decoder.upsamples.2.upsamples.0.residual.3.gamma", "model.decoder.upsamples.2.upsamples.0.residual.6.bias", "model.decoder.upsamples.2.upsamples.0.residual.6.weight", "model.decoder.upsamples.2.upsamples.0.shortcut.bias", "model.decoder.upsamples.2.upsamples.0.shortcut.weight", "model.decoder.upsamples.2.upsamples.1.residual.0.gamma", "model.decoder.upsamples.2.upsamples.1.residual.2.bias", "model.decoder.upsamples.2.upsamples.1.residual.2.weight", "model.decoder.upsamples.2.upsamples.1.residual.3.gamma", "model.decoder.upsamples.2.upsamples.1.residual.6.bias", "model.decoder.upsamples.2.upsamples.1.residual.6.weight", "model.decoder.upsamples.2.upsamples.2.residual.0.gamma", "model.decoder.upsamples.2.upsamples.2.residual.2.bias", "model.decoder.upsamples.2.upsamples.2.residual.2.weight", "model.decoder.upsamples.2.upsamples.2.residual.3.gamma", "model.decoder.upsamples.2.upsamples.2.residual.6.bias", "model.decoder.upsamples.2.upsamples.2.residual.6.weight", "model.decoder.upsamples.2.upsamples.3.resample.1.bias", "model.decoder.upsamples.2.upsamples.3.resample.1.weight", "model.decoder.upsamples.3.upsamples.0.residual.0.gamma", "model.decoder.upsamples.3.upsamples.0.residual.2.bias", "model.decoder.upsamples.3.upsamples.0.residual.2.weight", 
"model.decoder.upsamples.3.upsamples.0.residual.3.gamma", "model.decoder.upsamples.3.upsamples.0.residual.6.bias", "model.decoder.upsamples.3.upsamples.0.residual.6.weight", "model.decoder.upsamples.3.upsamples.0.shortcut.bias", "model.decoder.upsamples.3.upsamples.0.shortcut.weight", "model.decoder.upsamples.3.upsamples.1.residual.0.gamma", "model.decoder.upsamples.3.upsamples.1.residual.2.bias", "model.decoder.upsamples.3.upsamples.1.residual.2.weight", "model.decoder.upsamples.3.upsamples.1.residual.3.gamma", "model.decoder.upsamples.3.upsamples.1.residual.6.bias", "model.decoder.upsamples.3.upsamples.1.residual.6.weight", "model.decoder.upsamples.3.upsamples.2.residual.0.gamma", "model.decoder.upsamples.3.upsamples.2.residual.2.bias", "model.decoder.upsamples.3.upsamples.2.residual.2.weight", "model.decoder.upsamples.3.upsamples.2.residual.3.gamma", "model.decoder.upsamples.3.upsamples.2.residual.6.bias", "model.decoder.upsamples.3.upsamples.2.residual.6.weight".
size mismatch for model.encoder.conv1.weight: copying a param with shape torch.Size([160, 12, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([96, 3, 3, 3, 3]).
size mismatch for model.encoder.conv1.bias: copying a param with shape torch.Size([160]) from checkpoint, the shape in current model is torch.Size([96]).
size mismatch for model.encoder.middle.0.residual.0.gamma: copying a param with shape torch.Size([640, 1, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 1, 1, 1]).
size mismatch for model.encoder.middle.0.residual.2.weight: copying a param with shape torch.Size([640, 640, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([384, 384, 3, 3, 3]).
size mismatch for model.encoder.middle.0.residual.2.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([384]).
size mismatch for model.encoder.middle.0.residual.3.gamma: copying a param with shape torch.Size([640, 1, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 1, 1, 1]).
size mismatch for model.encoder.middle.0.residual.6.weight: copying a param with shape torch.Size([640, 640, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([384, 384, 3, 3, 3]).
size mismatch for model.encoder.middle.0.residual.6.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([384]).
size mismatch for model.encoder.middle.1.norm.gamma: copying a param with shape torch.Size([640, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 1, 1]).
size mismatch for model.encoder.middle.1.to_qkv.weight: copying a param with shape torch.Size([1920, 640, 1, 1]) from checkpoint, the shape in current model is torch.Size([1152, 384, 1, 1]).
size mismatch for model.encoder.middle.1.to_qkv.bias: copying a param with shape torch.Size([1920]) from checkpoint, the shape in current model is torch.Size([1152]).
size mismatch for model.encoder.middle.1.proj.weight: copying a param with shape torch.Size([640, 640, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 384, 1, 1]).
size mismatch for model.encoder.middle.1.proj.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([384]).
size mismatch for model.encoder.middle.2.residual.0.gamma: copying a param with shape torch.Size([640, 1, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 1, 1, 1]).
size mismatch for model.encoder.middle.2.residual.2.weight: copying a param with shape torch.Size([640, 640, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([384, 384, 3, 3, 3]).
size mismatch for model.encoder.middle.2.residual.2.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([384]).
size mismatch for model.encoder.middle.2.residual.3.gamma: copying a param with shape torch.Size([640, 1, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 1, 1, 1]).
size mismatch for model.encoder.middle.2.residual.6.weight: copying a param with shape torch.Size([640, 640, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([384, 384, 3, 3, 3]).
size mismatch for model.encoder.middle.2.residual.6.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([384]).
size mismatch for model.encoder.head.0.gamma: copying a param with shape torch.Size([640, 1, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 1, 1, 1]).
size mismatch for model.encoder.head.2.weight: copying a param with shape torch.Size([96, 640, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([32, 384, 3, 3, 3]).
size mismatch for model.encoder.head.2.bias: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for model.conv1.weight: copying a param with shape torch.Size([96, 96, 1, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 1, 1, 1]).
size mismatch for model.conv1.bias: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for model.conv2.weight: copying a param with shape torch.Size([48, 48, 1, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 1, 1, 1]).
size mismatch for model.conv2.bias: copying a param with shape torch.Size([48]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for model.decoder.conv1.weight: copying a param with shape torch.Size([1024, 48, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([384, 16, 3, 3, 3]).
size mismatch for model.decoder.conv1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]).
size mismatch for model.decoder.middle.0.residual.0.gamma: copying a param with shape torch.Size([1024, 1, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 1, 1, 1]).
size mismatch for model.decoder.middle.0.residual.2.weight: copying a param with shape torch.Size([1024, 1024, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([384, 384, 3, 3, 3]).
size mismatch for model.decoder.middle.0.residual.2.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]).
size mismatch for model.decoder.middle.0.residual.3.gamma: copying a param with shape torch.Size([1024, 1, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 1, 1, 1]).
size mismatch for model.decoder.middle.0.residual.6.weight: copying a param with shape torch.Size([1024, 1024, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([384, 384, 3, 3, 3]).
size mismatch for model.decoder.middle.0.residual.6.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]).
size mismatch for model.decoder.middle.1.norm.gamma: copying a param with shape torch.Size([1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 1, 1]).
size mismatch for model.decoder.middle.1.to_qkv.weight: copying a param with shape torch.Size([3072, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([1152, 384, 1, 1]).
size mismatch for model.decoder.middle.1.to_qkv.bias: copying a param with shape torch.Size([3072]) from checkpoint, the shape in current model is torch.Size([1152]).
size mismatch for model.decoder.middle.1.proj.weight: copying a param with shape torch.Size([1024, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 384, 1, 1]).
size mismatch for model.decoder.middle.1.proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]).
size mismatch for model.decoder.middle.2.residual.0.gamma: copying a param with shape torch.Size([1024, 1, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 1, 1, 1]).
size mismatch for model.decoder.middle.2.residual.2.weight: copying a param with shape torch.Size([1024, 1024, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([384, 384, 3, 3, 3]).
size mismatch for model.decoder.middle.2.residual.2.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]).
size mismatch for model.decoder.middle.2.residual.3.gamma: copying a param with shape torch.Size([1024, 1, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 1, 1, 1]).
size mismatch for model.decoder.middle.2.residual.6.weight: copying a param with shape torch.Size([1024, 1024, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([384, 384, 3, 3, 3]).
size mismatch for model.decoder.middle.2.residual.6.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]).
size mismatch for model.decoder.head.0.gamma: copying a param with shape torch.Size([256, 1, 1, 1]) from checkpoint, the shape in current model is torch.Size([96, 1, 1, 1]).
size mismatch for model.decoder.head.2.weight: copying a param with shape torch.Size([12, 256, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([3, 96, 3, 3, 3]).
size mismatch for model.decoder.head.2.bias: copying a param with shape torch.Size([12]) from checkpoint, the shape in current model is torch.Size([3]).
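The missing vs. unexpected keys above show the mismatch: the checkpoint stores its encoder blocks under a nested `model.encoder.downsamples.N.downsamples.M...` prefix (and with larger channel sizes), while the model the node builds expects a flat `model.encoder.downsamples.N.residual...` layout. A quick way to see which layout a checkpoint has is to look at its key names; the helper below is a hypothetical sketch, not part of the wrapper, and the sample keys are taken from the error log:

```python
def detect_vae_layout(keys):
    """Classify a Wan VAE state_dict by its encoder key structure.

    Returns "nested" for keys like
    "model.encoder.downsamples.N.downsamples.M..." (the format the
    checkpoint above contains) and "flat" for
    "model.encoder.downsamples.N.residual..." (the format the
    WanVideoVAE build in the error expects).
    """
    for key in keys:
        parts = key.split(".")
        # parts[3] is the block index; what follows it reveals the layout
        if len(parts) > 4 and parts[1] == "encoder" and parts[2] == "downsamples":
            return "nested" if parts[4] == "downsamples" else "flat"
    return "unknown"
```

For a real .safetensors file, the key list can be obtained without loading the weights via `safetensors.safe_open(path, framework="pt")` and its `keys()` method, then passed to the helper.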
"model.encoder.downsamples.1.downsamples.2.time_conv.weight", "model.encoder.downsamples.2.downsamples.0.residual.0.gamma", "model.encoder.downsamples.2.downsamples.0.residual.2.bias", "model.encoder.downsamples.2.downsamples.0.residual.2.weight", "model.encoder.downsamples.2.downsamples.0.residual.3.gamma", "model.encoder.downsamples.2.downsamples.0.residual.6.bias", "model.encoder.downsamples.2.downsamples.0.residual.6.weight", "model.encoder.downsamples.2.downsamples.0.shortcut.bias", "model.encoder.downsamples.2.downsamples.0.shortcut.weight", "model.encoder.downsamples.2.downsamples.1.residual.0.gamma", "model.encoder.downsamples.2.downsamples.1.residual.2.bias", "model.encoder.downsamples.2.downsamples.1.residual.2.weight", "model.encoder.downsamples.2.downsamples.1.residual.3.gamma", "model.encoder.downsamples.2.downsamples.1.residual.6.bias", "model.encoder.downsamples.2.downsamples.1.residual.6.weight", "model.encoder.downsamples.2.downsamples.2.resample.1.bias", "model.encoder.downsamples.2.downsamples.2.resample.1.weight", "model.encoder.downsamples.2.downsamples.2.time_conv.bias", "model.encoder.downsamples.2.downsamples.2.time_conv.weight", "model.encoder.downsamples.3.downsamples.0.residual.0.gamma", "model.encoder.downsamples.3.downsamples.0.residual.2.bias", "model.encoder.downsamples.3.downsamples.0.residual.2.weight", "model.encoder.downsamples.3.downsamples.0.residual.3.gamma", "model.encoder.downsamples.3.downsamples.0.residual.6.bias", "model.encoder.downsamples.3.downsamples.0.residual.6.weight", "model.encoder.downsamples.3.downsamples.1.residual.0.gamma", "model.encoder.downsamples.3.downsamples.1.residual.2.bias", "model.encoder.downsamples.3.downsamples.1.residual.2.weight", "model.encoder.downsamples.3.downsamples.1.residual.3.gamma", "model.encoder.downsamples.3.downsamples.1.residual.6.bias", "model.encoder.downsamples.3.downsamples.1.residual.6.weight", "model.decoder.upsamples.0.upsamples.0.residual.0.gamma", 
"model.decoder.upsamples.0.upsamples.0.residual.2.bias", "model.decoder.upsamples.0.upsamples.0.residual.2.weight", "model.decoder.upsamples.0.upsamples.0.residual.3.gamma", "model.decoder.upsamples.0.upsamples.0.residual.6.bias", "model.decoder.upsamples.0.upsamples.0.residual.6.weight", "model.decoder.upsamples.0.upsamples.1.residual.0.gamma", "model.decoder.upsamples.0.upsamples.1.residual.2.bias", "model.decoder.upsamples.0.upsamples.1.residual.2.weight", "model.decoder.upsamples.0.upsamples.1.residual.3.gamma", "model.decoder.upsamples.0.upsamples.1.residual.6.bias", "model.decoder.upsamples.0.upsamples.1.residual.6.weight", "model.decoder.upsamples.0.upsamples.2.residual.0.gamma", "model.decoder.upsamples.0.upsamples.2.residual.2.bias", "model.decoder.upsamples.0.upsamples.2.residual.2.weight", "model.decoder.upsamples.0.upsamples.2.residual.3.gamma", "model.decoder.upsamples.0.upsamples.2.residual.6.bias", "model.decoder.upsamples.0.upsamples.2.residual.6.weight", "model.decoder.upsamples.0.upsamples.3.resample.1.bias", "model.decoder.upsamples.0.upsamples.3.resample.1.weight", "model.decoder.upsamples.0.upsamples.3.time_conv.bias", "model.decoder.upsamples.0.upsamples.3.time_conv.weight", "model.decoder.upsamples.1.upsamples.0.residual.0.gamma", "model.decoder.upsamples.1.upsamples.0.residual.2.bias", "model.decoder.upsamples.1.upsamples.0.residual.2.weight", "model.decoder.upsamples.1.upsamples.0.residual.3.gamma", "model.decoder.upsamples.1.upsamples.0.residual.6.bias", "model.decoder.upsamples.1.upsamples.0.residual.6.weight", "model.decoder.upsamples.1.upsamples.1.residual.0.gamma", "model.decoder.upsamples.1.upsamples.1.residual.2.bias", "model.decoder.upsamples.1.upsamples.1.residual.2.weight", "model.decoder.upsamples.1.upsamples.1.residual.3.gamma", "model.decoder.upsamples.1.upsamples.1.residual.6.bias", "model.decoder.upsamples.1.upsamples.1.residual.6.weight", "model.decoder.upsamples.1.upsamples.2.residual.0.gamma", 
"model.decoder.upsamples.1.upsamples.2.residual.2.bias", "model.decoder.upsamples.1.upsamples.2.residual.2.weight", "model.decoder.upsamples.1.upsamples.2.residual.3.gamma", "model.decoder.upsamples.1.upsamples.2.residual.6.bias", "model.decoder.upsamples.1.upsamples.2.residual.6.weight", "model.decoder.upsamples.1.upsamples.3.resample.1.bias", "model.decoder.upsamples.1.upsamples.3.resample.1.weight", "model.decoder.upsamples.1.upsamples.3.time_conv.bias", "model.decoder.upsamples.1.upsamples.3.time_conv.weight", "model.decoder.upsamples.2.upsamples.0.residual.0.gamma", "model.decoder.upsamples.2.upsamples.0.residual.2.bias", "model.decoder.upsamples.2.upsamples.0.residual.2.weight", "model.decoder.upsamples.2.upsamples.0.residual.3.gamma", "model.decoder.upsamples.2.upsamples.0.residual.6.bias", "model.decoder.upsamples.2.upsamples.0.residual.6.weight", "model.decoder.upsamples.2.upsamples.0.shortcut.bias", "model.decoder.upsamples.2.upsamples.0.shortcut.weight", "model.decoder.upsamples.2.upsamples.1.residual.0.gamma", "model.decoder.upsamples.2.upsamples.1.residual.2.bias", "model.decoder.upsamples.2.upsamples.1.residual.2.weight", "model.decoder.upsamples.2.upsamples.1.residual.3.gamma", "model.decoder.upsamples.2.upsamples.1.residual.6.bias", "model.decoder.upsamples.2.upsamples.1.residual.6.weight", "model.decoder.upsamples.2.upsamples.2.residual.0.gamma", "model.decoder.upsamples.2.upsamples.2.residual.2.bias", "model.decoder.upsamples.2.upsamples.2.residual.2.weight", "model.decoder.upsamples.2.upsamples.2.residual.3.gamma", "model.decoder.upsamples.2.upsamples.2.residual.6.bias", "model.decoder.upsamples.2.upsamples.2.residual.6.weight", "model.decoder.upsamples.2.upsamples.3.resample.1.bias", "model.decoder.upsamples.2.upsamples.3.resample.1.weight", "model.decoder.upsamples.3.upsamples.0.residual.0.gamma", "model.decoder.upsamples.3.upsamples.0.residual.2.bias", "model.decoder.upsamples.3.upsamples.0.residual.2.weight", 
"model.decoder.upsamples.3.upsamples.0.residual.3.gamma", "model.decoder.upsamples.3.upsamples.0.residual.6.bias", "model.decoder.upsamples.3.upsamples.0.residual.6.weight", "model.decoder.upsamples.3.upsamples.0.shortcut.bias", "model.decoder.upsamples.3.upsamples.0.shortcut.weight", "model.decoder.upsamples.3.upsamples.1.residual.0.gamma", "model.decoder.upsamples.3.upsamples.1.residual.2.bias", "model.decoder.upsamples.3.upsamples.1.residual.2.weight", "model.decoder.upsamples.3.upsamples.1.residual.3.gamma", "model.decoder.upsamples.3.upsamples.1.residual.6.bias", "model.decoder.upsamples.3.upsamples.1.residual.6.weight", "model.decoder.upsamples.3.upsamples.2.residual.0.gamma", "model.decoder.upsamples.3.upsamples.2.residual.2.bias", "model.decoder.upsamples.3.upsamples.2.residual.2.weight", "model.decoder.upsamples.3.upsamples.2.residual.3.gamma", "model.decoder.upsamples.3.upsamples.2.residual.6.bias", "model.decoder.upsamples.3.upsamples.2.residual.6.weight".
size mismatch for model.encoder.conv1.weight: copying a param with shape torch.Size([160, 12, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([96, 3, 3, 3, 3]).
size mismatch for model.encoder.conv1.bias: copying a param with shape torch.Size([160]) from checkpoint, the shape in current model is torch.Size([96]).
size mismatch for model.encoder.middle.0.residual.0.gamma: copying a param with shape torch.Size([640, 1, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 1, 1, 1]).
size mismatch for model.encoder.middle.0.residual.2.weight: copying a param with shape torch.Size([640, 640, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([384, 384, 3, 3, 3]).
size mismatch for model.encoder.middle.0.residual.2.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([384]).
size mismatch for model.encoder.middle.0.residual.3.gamma: copying a param with shape torch.Size([640, 1, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 1, 1, 1]).
size mismatch for model.encoder.middle.0.residual.6.weight: copying a param with shape torch.Size([640, 640, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([384, 384, 3, 3, 3]).
size mismatch for model.encoder.middle.0.residual.6.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([384]).
size mismatch for model.encoder.middle.1.norm.gamma: copying a param with shape torch.Size([640, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 1, 1]).
size mismatch for model.encoder.middle.1.to_qkv.weight: copying a param with shape torch.Size([1920, 640, 1, 1]) from checkpoint, the shape in current model is torch.Size([1152, 384, 1, 1]).
size mismatch for model.encoder.middle.1.to_qkv.bias: copying a param with shape torch.Size([1920]) from checkpoint, the shape in current model is torch.Size([1152]).
size mismatch for model.encoder.middle.1.proj.weight: copying a param with shape torch.Size([640, 640, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 384, 1, 1]).
size mismatch for model.encoder.middle.1.proj.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([384]).
size mismatch for model.encoder.middle.2.residual.0.gamma: copying a param with shape torch.Size([640, 1, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 1, 1, 1]).
size mismatch for model.encoder.middle.2.residual.2.weight: copying a param with shape torch.Size([640, 640, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([384, 384, 3, 3, 3]).
size mismatch for model.encoder.middle.2.residual.2.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([384]).
size mismatch for model.encoder.middle.2.residual.3.gamma: copying a param with shape torch.Size([640, 1, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 1, 1, 1]).
size mismatch for model.encoder.middle.2.residual.6.weight: copying a param with shape torch.Size([640, 640, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([384, 384, 3, 3, 3]).
size mismatch for model.encoder.middle.2.residual.6.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([384]).
size mismatch for model.encoder.head.0.gamma: copying a param with shape torch.Size([640, 1, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 1, 1, 1]).
size mismatch for model.encoder.head.2.weight: copying a param with shape torch.Size([96, 640, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([32, 384, 3, 3, 3]).
size mismatch for model.encoder.head.2.bias: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for model.conv1.weight: copying a param with shape torch.Size([96, 96, 1, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 1, 1, 1]).
size mismatch for model.conv1.bias: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for model.conv2.weight: copying a param with shape torch.Size([48, 48, 1, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 1, 1, 1]).
size mismatch for model.conv2.bias: copying a param with shape torch.Size([48]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for model.decoder.conv1.weight: copying a param with shape torch.Size([1024, 48, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([384, 16, 3, 3, 3]).
size mismatch for model.decoder.conv1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]).
size mismatch for model.decoder.middle.0.residual.0.gamma: copying a param with shape torch.Size([1024, 1, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 1, 1, 1]).
size mismatch for model.decoder.middle.0.residual.2.weight: copying a param with shape torch.Size([1024, 1024, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([384, 384, 3, 3, 3]).
size mismatch for model.decoder.middle.0.residual.2.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]).
size mismatch for model.decoder.middle.0.residual.3.gamma: copying a param with shape torch.Size([1024, 1, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 1, 1, 1]).
size mismatch for model.decoder.middle.0.residual.6.weight: copying a param with shape torch.Size([1024, 1024, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([384, 384, 3, 3, 3]).
size mismatch for model.decoder.middle.0.residual.6.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]).
size mismatch for model.decoder.middle.1.norm.gamma: copying a param with shape torch.Size([1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 1, 1]).
size mismatch for model.decoder.middle.1.to_qkv.weight: copying a param with shape torch.Size([3072, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([1152, 384, 1, 1]).
size mismatch for model.decoder.middle.1.to_qkv.bias: copying a param with shape torch.Size([3072]) from checkpoint, the shape in current model is torch.Size([1152]).
size mismatch for model.decoder.middle.1.proj.weight: copying a param with shape torch.Size([1024, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 384, 1, 1]).
size mismatch for model.decoder.middle.1.proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]).
size mismatch for model.decoder.middle.2.residual.0.gamma: copying a param with shape torch.Size([1024, 1, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 1, 1, 1]).
size mismatch for model.decoder.middle.2.residual.2.weight: copying a param with shape torch.Size([1024, 1024, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([384, 384, 3, 3, 3]).
size mismatch for model.decoder.middle.2.residual.2.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]).
size mismatch for model.decoder.middle.2.residual.3.gamma: copying a param with shape torch.Size([1024, 1, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 1, 1, 1]).
size mismatch for model.decoder.middle.2.residual.6.weight: copying a param with shape torch.Size([1024, 1024, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([384, 384, 3, 3, 3]).
size mismatch for model.decoder.middle.2.residual.6.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]).
size mismatch for model.decoder.head.0.gamma: copying a param with shape torch.Size([256, 1, 1, 1]) from checkpoint, the shape in current model is torch.Size([96, 1, 1, 1]).
size mismatch for model.decoder.head.2.weight: copying a param with shape torch.Size([12, 256, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([3, 96, 3, 3, 3]).
size mismatch for model.decoder.head.2.bias: copying a param with shape torch.Size([12]) from checkpoint, the shape in current model is torch.Size([3]).
The file is this: https://huggingface.co/Kijai/WanVideo_comfy/blob/main/Wan2_2_VAE_bf16.safetensors
Double checked by downloading it myself and loading it.
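Side note for anyone else comparing the key lists above: the missing keys all use a flat `downsamples.N.*` layout, while the unexpected keys use a nested `downsamples.N.downsamples.M.*` layout, so the node is building one VAE variant while the checkpoint contains the other. A quick way to check which layout a given state dict uses is sketched below; `guess_wan_vae_layout` is a hypothetical helper for illustration, not part of the wrapper, and the "wan2.1"/"wan2.2" labels are only an assumption based on the key patterns in this error.

```python
# Guess which Wan VAE key layout a state dict uses, based on the two key
# patterns visible in the load_state_dict error above.
import re

# Nested layout seen in the "Unexpected key(s)" list, e.g.
# "model.encoder.downsamples.0.downsamples.0.residual.0.gamma"
NESTED = re.compile(r"downsamples\.\d+\.downsamples\.\d+\.")

# Flat layout seen in the "Missing key(s)" list, e.g.
# "model.encoder.downsamples.3.residual.6.bias"
FLAT = re.compile(r"downsamples\.\d+\.(residual|resample|shortcut|time_conv)\.")

def guess_wan_vae_layout(keys):
    """Return 'wan2.2' for the nested layout, 'wan2.1' for the flat one."""
    if any(NESTED.search(k) for k in keys):
        return "wan2.2"
    if any(FLAT.search(k) for k in keys):
        return "wan2.1"
    return "unknown"
```

To feed it the real keys without loading any tensors, the `safetensors` package's `safe_open(path, framework="pt")` context manager exposes a `keys()` method, so `guess_wan_vae_layout(f.keys())` works on the downloaded file directly.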
size mismatch for model.encoder.middle.0.residual.6.weight: copying a param with shape torch.Size([640, 640, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([384, 384, 3, 3, 3]).
size mismatch for model.encoder.middle.0.residual.6.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([384]).
size mismatch for model.encoder.middle.1.norm.gamma: copying a param with shape torch.Size([640, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 1, 1]).
size mismatch for model.encoder.middle.1.to_qkv.weight: copying a param with shape torch.Size([1920, 640, 1, 1]) from checkpoint, the shape in current model is torch.Size([1152, 384, 1, 1]).
size mismatch for model.encoder.middle.1.to_qkv.bias: copying a param with shape torch.Size([1920]) from checkpoint, the shape in current model is torch.Size([1152]).
size mismatch for model.encoder.middle.1.proj.weight: copying a param with shape torch.Size([640, 640, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 384, 1, 1]).
size mismatch for model.encoder.middle.1.proj.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([384]).
size mismatch for model.encoder.middle.2.residual.0.gamma: copying a param with shape torch.Size([640, 1, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 1, 1, 1]).
size mismatch for model.encoder.middle.2.residual.2.weight: copying a param with shape torch.Size([640, 640, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([384, 384, 3, 3, 3]).
size mismatch for model.encoder.middle.2.residual.2.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([384]).
size mismatch for model.encoder.middle.2.residual.3.gamma: copying a param with shape torch.Size([640, 1, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 1, 1, 1]).
size mismatch for model.encoder.middle.2.residual.6.weight: copying a param with shape torch.Size([640, 640, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([384, 384, 3, 3, 3]).
size mismatch for model.encoder.middle.2.residual.6.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([384]).
size mismatch for model.encoder.head.0.gamma: copying a param with shape torch.Size([640, 1, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 1, 1, 1]).
size mismatch for model.encoder.head.2.weight: copying a param with shape torch.Size([96, 640, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([32, 384, 3, 3, 3]).
size mismatch for model.encoder.head.2.bias: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for model.conv1.weight: copying a param with shape torch.Size([96, 96, 1, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 1, 1, 1]).
size mismatch for model.conv1.bias: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for model.conv2.weight: copying a param with shape torch.Size([48, 48, 1, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 1, 1, 1]).
size mismatch for model.conv2.bias: copying a param with shape torch.Size([48]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for model.decoder.conv1.weight: copying a param with shape torch.Size([1024, 48, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([384, 16, 3, 3, 3]).
size mismatch for model.decoder.conv1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]).
size mismatch for model.decoder.middle.0.residual.0.gamma: copying a param with shape torch.Size([1024, 1, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 1, 1, 1]).
size mismatch for model.decoder.middle.0.residual.2.weight: copying a param with shape torch.Size([1024, 1024, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([384, 384, 3, 3, 3]).
size mismatch for model.decoder.middle.0.residual.2.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]).
size mismatch for model.decoder.middle.0.residual.3.gamma: copying a param with shape torch.Size([1024, 1, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 1, 1, 1]).
size mismatch for model.decoder.middle.0.residual.6.weight: copying a param with shape torch.Size([1024, 1024, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([384, 384, 3, 3, 3]).
size mismatch for model.decoder.middle.0.residual.6.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]).
size mismatch for model.decoder.middle.1.norm.gamma: copying a param with shape torch.Size([1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 1, 1]).
size mismatch for model.decoder.middle.1.to_qkv.weight: copying a param with shape torch.Size([3072, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([1152, 384, 1, 1]).
size mismatch for model.decoder.middle.1.to_qkv.bias: copying a param with shape torch.Size([3072]) from checkpoint, the shape in current model is torch.Size([1152]).
size mismatch for model.decoder.middle.1.proj.weight: copying a param with shape torch.Size([1024, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 384, 1, 1]).
size mismatch for model.decoder.middle.1.proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]).
size mismatch for model.decoder.middle.2.residual.0.gamma: copying a param with shape torch.Size([1024, 1, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 1, 1, 1]).
size mismatch for model.decoder.middle.2.residual.2.weight: copying a param with shape torch.Size([1024, 1024, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([384, 384, 3, 3, 3]).
size mismatch for model.decoder.middle.2.residual.2.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]).
size mismatch for model.decoder.middle.2.residual.3.gamma: copying a param with shape torch.Size([1024, 1, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 1, 1, 1]).
size mismatch for model.decoder.middle.2.residual.6.weight: copying a param with shape torch.Size([1024, 1024, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([384, 384, 3, 3, 3]).
size mismatch for model.decoder.middle.2.residual.6.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]).
size mismatch for model.decoder.head.0.gamma: copying a param with shape torch.Size([256, 1, 1, 1]) from checkpoint, the shape in current model is torch.Size([96, 1, 1, 1]).
size mismatch for model.decoder.head.2.weight: copying a param with shape torch.Size([12, 256, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([3, 96, 3, 3, 3]).
size mismatch for model.decoder.head.2.bias: copying a param with shape torch.Size([12]) from checkpoint, the shape in current model is torch.Size([3]).
The file is this: https://huggingface.co/Kijai/WanVideo_comfy/blob/main/Wan2_2_VAE_bf16.safetensors
Double checked by downloading it myself and loading it.
Still the same error. I also renamed it to be sure it's the new file. Maybe the problem sits elsewhere and the VAE load is just where it surfaces?
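For what it's worth, one way to rule out a stale or wrong file is to inspect the tensor names and shapes stored in the `.safetensors` on disk and compare them against the missing keys in the error. This is a minimal stdlib-only sketch that reads just the JSON header of the safetensors format (8-byte little-endian header length, then a JSON map of tensor names to dtype/shape); the file path is an assumption, point it at your own `models/vae` folder:

```python
# Sketch: list tensor names and shapes from a .safetensors file
# without loading any weights, by parsing only the JSON header.
import json
import struct

def safetensors_header(path):
    """Return {tensor_name: {"dtype": ..., "shape": ..., ...}} for a file."""
    with open(path, "rb") as f:
        # First 8 bytes: little-endian u64 length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    header.pop("__metadata__", None)  # optional metadata block, not a tensor
    return header

# Example usage (hypothetical path, adjust to your setup):
# for name, info in safetensors_header("Wan2_2_VAE_bf16.safetensors").items():
#     print(name, info["shape"], info["dtype"])
```

If the printed keys don't contain entries like `model.encoder.downsamples.0.residual.0.gamma`, the file on disk is not the one the node expects.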
Fixed the VAE problem! I had the MultiTalk custom nodes installed, which contained duplicates of the WanVideoWrapper custom nodes, and it seems the wrong one was being loaded. After removing the MultiTalk custom nodes, VAE loading works fine. But now I'm stuck at the WanVideo Sampler with this error:
WanVideoSampler
cannot import name 'AttrsDescriptor' from 'triton.compiler.compiler' (F:\ComfyUI_3\ComfyUI_windows_portable\python_embeded\lib\site-packages\triton\compiler\compiler.py)
So it seems I have to update Triton on Windows? I remember it was very frustrating to get installed =)
EDIT: is the Triton version really the problem? Updating it means updating a lot of other stuff along with it, which will probably break things...
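A side-effect-free way to see which Triton version is installed, without importing the package itself (a failed `import triton` can be messy): a small stdlib-only sketch using package metadata, which returns `None` when the package is absent.

```python
# Sketch: report the installed version of a package via its
# distribution metadata, without importing the package.
from importlib import metadata

def installed_version(package):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

print(installed_version("triton"))  # e.g. a version string, or None
```

Comparing that against the version the node pack was tested with narrows down whether the `AttrsDescriptor` import failure is a version mismatch.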
It depends so much on everything else in your setup that it's hard to answer. In general, when using Windows, following the instructions here is the best approach:
https://github.com/woct0rdho/triton-windows
Or you can just not use Triton by disabling torch.compile and/or sageattention, they just speed everything up by a lot.
I added --disable-pytorch-compile to run_nvidia_gpu.bat, but my ComfyUI version doesn't recognize that argument. So I updated ComfyUI with dependencies, which then also broke everything =D
I'm on Triton version 3.2. If yours is newer and working, then updating Triton seems to be the way to go.
Updating Triton did the trick! I am on Triton 3.5 now.