
Qualitative results of video generation using Wan-Alpha. Our model successfully generates various scenes with accurate and clearly rendered transparency. Notably, it can synthesize diverse semi-transparent objects, glowing effects, and fine-grained details such as hair.
## 🔥 News
- [2025.09.30] Released Wan-Alpha v1.0: the weights adapted from Wan2.1-14B-T2V and the inference code are now open-sourced.
## 🌟 Showcase
### Text-to-Video Generation with Alpha Channel
| Prompt | Preview Video | Alpha Video |
| --- | --- | --- |
| "Medium shot. A little girl holds a bubble wand and blows out colorful bubbles that float and pop in the air. The background of this video is transparent. Realistic style." | ![]() | ![]() |
For more results, please visit our website.
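
Each preview video is paired with an alpha video that stores per-pixel opacity. As a minimal illustration (the frame file names below are hypothetical), a preview frame can be composited over any background with standard alpha blending:

```python
# Minimal alpha-compositing sketch. Frame file names are hypothetical; frames
# can be extracted from the preview and alpha videos with a tool such as ffmpeg.
import numpy as np
from PIL import Image

def composite(rgb_path: str, alpha_path: str, bg_color=(0, 128, 0)) -> Image.Image:
    """Blend an RGB preview frame over a solid background using its alpha matte."""
    rgb = np.asarray(Image.open(rgb_path).convert("RGB"), dtype=np.float32)
    # The alpha video is grayscale: 255 = fully opaque, 0 = fully transparent.
    alpha = np.asarray(Image.open(alpha_path).convert("L"), dtype=np.float32) / 255.0
    alpha = alpha[..., None]                    # broadcast over the RGB channels
    bg = np.empty_like(rgb)
    bg[...] = bg_color                          # solid background color
    out = rgb * alpha + bg * (1.0 - alpha)      # standard "over" compositing
    return Image.fromarray(out.astype(np.uint8))

# Example usage:
# composite("preview_frame_0001.png", "alpha_frame_0001.png").save("composited.png")
```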
## 🚀 Quick Start
Please see the GitHub repository for details on running the code.
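
As a starting point, the weights can be fetched from this repository with huggingface_hub. This is only a download sketch (the local directory name is an arbitrary choice); the ComfyUI workflow itself is described in the GitHub repository:

```python
# Download the Wan-Alpha ComfyUI weights from the Hugging Face Hub.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="htdong/Wan-Alpha_ComfyUI",  # this model repository
    local_dir="./Wan-Alpha_ComfyUI",     # destination folder (arbitrary choice)
)
```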
## 🤗 Acknowledgements
This project is built upon the following excellent open-source projects:
- DiffSynth-Studio (training/inference framework)
- Wan2.1 (base video generation model)
- LightX2V (inference acceleration)
- WanVideo_comfy (inference acceleration)
We sincerely thank the authors and contributors of these projects.
## ⭐ Citation
If you find our work helpful for your research, please consider citing our paper:
```bibtex
@misc{dong2025wanalpha,
  title={Wan-Alpha: High-Quality Text-to-Video Generation with Alpha Channel},
  author={Haotian Dong and Wenjing Wang and Chen Li and Di Lin},
  year={2025},
  eprint={2509.24979},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2509.24979},
}
```
## 💬 Contact Us
If you have any questions or suggestions, feel free to reach out via GitHub Issues. We look forward to your feedback!