Time to ditch SDXL?
I feel like SDXL is showing its age and it's time to ditch it for a better model like Wan2.1. I am mentioning Wan2.1 because imo it can produce the best images of any open source model right now (try it).
Why Wan2.1 over Flux?
1: Faster in image gen.
2: Better aesthetic.
3: Can do high res (1920*1280) gens in 15-20 seconds on my 4080 16GB.
4: No "AI slop" look.
5: Undistilled, so training is possible without breaking it.
6: Has multiple ways to control images and videos in its ecosystem (Phantom, VACE, ATI, Skyreels A2, MAGREF, Multitalk, Wan-Fun, etc.)
7: Has multiple distill loras and models.
8: Apache 2.0 licensed.
9: 4-6x speed (nunchaku, radial attention, SVG) incoming.
10: Almost all future research is going to be done on DiT models like Wan2.1.
https://www.reddit.com/r/StableDiffusion/comments/1lu7nxx/wan_21_txt2img_is_amazing/
https://www.pruna.ai/blog/wan-image-juiced-image-generation
Unrelated: I love your work.
Thank you, I'll definitely be taking a look. At the moment I'd say it's between Wan2.1 and Chroma. The biggest drawback with Wan2.1, at a quick glance, is its size (14B parameters) vs Chroma (9B), so I think it would require heavier quantization for people to run. But I'll give them a whirl and see.
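To put rough numbers on the size concern, here is a back-of-the-envelope sketch of how much memory the weights alone would take at a few precisions. It assumes the parameter counts quoted above (14B and 9B) and simply multiplies parameters by bits per weight; the `weight_gb` helper is made up for illustration, and the estimate ignores the text encoder, VAE, activations, and any quantization overhead.

```python
def weight_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate size of the model weights alone, in GB (1 GB = 1e9 bytes)."""
    # billions of params * bits per weight / 8 bits per byte = billions of bytes
    return params_billion * bits_per_weight / 8


# Parameter counts as quoted in this thread (assumed, not measured).
MODELS = {"Wan2.1 14B": 14.0, "Chroma 9B": 9.0}

for name, params in MODELS.items():
    sizes = ", ".join(
        f"{bits}-bit: {weight_gb(params, bits):.1f} GB" for bits in (16, 8, 4)
    )
    print(f"{name}: {sizes}")
```

By this estimate, 8-bit weights come to about 14 GB for the 14B model versus 9 GB for the 9B one, which is why the larger model would likely need to go a step lower in precision to leave headroom on a 16 GB card.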
Wan2.2 is going to have a 5B version as well as a 14B MoE model. And I think the 5B model would be a great contender to Chroma if it's as good as 2.1 14B.
A) Wan 2.2 is out. If anything, do that, not 2.1.
B) I'd broadly argue that Chroma is likely a far more worthwhile target in terms of time-spent-training to actual-users-had-of-the-model-when-it's-done. Flux-sized models are a LOT more reasonable for a LOT more people, and given what you've already done with SDXL for example up to this point I really don't think the benefits of WAN anything on the Flux arch would really outweigh the downsides, personally.
Chroma, while a good model, produces a lot of artifacts. It also doesn't have any models for controllable generation, and it still has that AI slop Flux look. Wan produces by far the best realistic images out of the box because it isn't a plastic skin fiesta. I will make some image comparisons soon; you will see how much better Wan is at image generation.
I'd argue Chroma knows like, WAY more concepts of the sort relevant to bigASP very well already though. I think it'd be a much much easier "jumping off point" than WAN, which would need to be taught essentially all concepts completely from scratch in bigASP.