
Qualitative results of video generation using Wan-Alpha. Our model successfully generates various scenes with accurate and clearly rendered transparency. Notably, it can synthesize diverse semi-transparent objects, glowing effects, and fine-grained details such as hair.
## 🔥 News
- [2025.09.30] Released Wan-Alpha v1.0: the weights adapted from Wan2.1-14B-T2V and the inference code are now open-sourced.
## 🌟 Showcase
### Text-to-Video Generation with Alpha Channel
| Prompt | Preview Video | Alpha Video |
| --- | --- | --- |
| "Medium shot. A little girl holds a bubble wand and blows out colorful bubbles that float and pop in the air. The background of this video is transparent. Realistic style." | ![]() | ![]() |
For more results, please visit our website.
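
Each preview video is paired with an alpha video that stores per-pixel opacity. As a minimal illustration (the frame file names below are hypothetical), a preview frame can be composited over any background with standard alpha blending:

```python
# Minimal alpha-compositing sketch. Frame file names are hypothetical; frames
# can be extracted from the preview and alpha videos with a tool such as ffmpeg.
import numpy as np
from PIL import Image

def composite(rgb_path: str, alpha_path: str, bg_color=(0, 128, 0)) -> Image.Image:
    """Blend an RGB preview frame over a solid background using its alpha matte."""
    rgb = np.asarray(Image.open(rgb_path).convert("RGB"), dtype=np.float32)
    # The alpha video is grayscale: 255 = fully opaque, 0 = fully transparent.
    alpha = np.asarray(Image.open(alpha_path).convert("L"), dtype=np.float32) / 255.0
    alpha = alpha[..., None]                    # broadcast over the RGB channels
    bg = np.empty_like(rgb)
    bg[...] = bg_color                          # solid background color
    out = rgb * alpha + bg * (1.0 - alpha)      # standard "over" compositing
    return Image.fromarray(out.astype(np.uint8))

# Example usage:
# composite("preview_frame_0001.png", "alpha_frame_0001.png").save("composited.png")
```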
## 🚀 Quick Start
Please see the GitHub repository for details on running the code.
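
As a starting point, the weights can be fetched from this repository with huggingface_hub. This is only a download sketch (the local directory name is an arbitrary choice); the ComfyUI workflow itself is described in the GitHub repository:

```python
# Download the Wan-Alpha ComfyUI weights from the Hugging Face Hub.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="htdong/Wan-Alpha_ComfyUI",  # this model repository
    local_dir="./Wan-Alpha_ComfyUI",     # destination folder (arbitrary choice)
)
```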
## 🤗 Acknowledgements
This project is built upon the following excellent open-source projects:
- DiffSynth-Studio (training/inference framework)
- Wan2.1 (base video generation model)
- LightX2V (inference acceleration)
- WanVideo_comfy (inference acceleration)
We sincerely thank the authors and contributors of these projects.
## ⭐ Citation
If you find our work helpful for your research, please consider citing our paper:
```bibtex
@misc{dong2025wanalpha,
  title={Wan-Alpha: High-Quality Text-to-Video Generation with Alpha Channel},
  author={Haotian Dong and Wenjing Wang and Chen Li and Di Lin},
  year={2025},
  eprint={2509.24979},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2509.24979},
}
```
## 💬 Contact Us
If you have any questions or suggestions, feel free to reach out via GitHub Issues. We look forward to your feedback!