FastVideo FastWan2.1-T2V-14B-480P-Diffusers

Introduction

This model is jointly finetuned with DMD (distribution matching distillation) and VSA (video sparse attention), starting from Wan-AI/Wan2.1-T2V-14B-Diffusers. It supports efficient 3-step inference and generates high-quality videos at 61×448×832 resolution (frames × height × width). Training uses the FastVideo 480P Synthetic Wan dataset, which consists of 600k synthetic latents.

Model Overview

3-step inference is supported and achieves up to a 50x speedup for the denoising loop on a single H100 GPU.
The model is trained at 61×448×832 resolution but can generate videos at other resolutions (quality may degrade).
Finetuning and inference scripts are available in the FastVideo repository.
Try it out on FastVideo: we support a wide range of GPUs, from the H100 to the 4090, as well as Macs.
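
To make the resolution note concrete, here is a minimal sketch of how a requested video size maps to a latent grid and a transformer token count. The compression factors (4x temporal, 8x spatial) and the (1, 2, 2) patch size are assumptions about the Wan2.1 base model and are not stated in this card; `latent_shape` and `token_count` are hypothetical helpers, not part of FastVideo.

```python
# Assumed values for the Wan2.1 base model (not stated in this model card).
TEMPORAL_STRIDE = 4      # assumed VAE temporal compression
SPATIAL_STRIDE = 8       # assumed VAE spatial compression
PATCH_SIZE = (1, 2, 2)   # assumed transformer patch size (t, h, w)


def latent_shape(frames: int, height: int, width: int) -> tuple[int, int, int]:
    """Return the (frames, height, width) of the latent grid for a video size."""
    if (frames - 1) % TEMPORAL_STRIDE != 0:
        raise ValueError("frames must have the form 4k + 1 under the assumed stride")
    spatial_multiple = SPATIAL_STRIDE * PATCH_SIZE[1]
    if height % spatial_multiple or width % spatial_multiple:
        raise ValueError(f"height and width must be multiples of {spatial_multiple}")
    return (
        (frames - 1) // TEMPORAL_STRIDE + 1,
        height // SPATIAL_STRIDE,
        width // SPATIAL_STRIDE,
    )


def token_count(frames: int, height: int, width: int) -> int:
    """Return the number of transformer tokens after patchifying the latent grid."""
    t, h, w = latent_shape(frames, height, width)
    return (t // PATCH_SIZE[0]) * (h // PATCH_SIZE[1]) * (w // PATCH_SIZE[2])


if __name__ == "__main__":
    # The resolution this model was trained at.
    print(latent_shape(61, 448, 832))
    print(token_count(61, 448, 832))
```

Under these assumptions, attention cost grows with the square of the token count, so resolutions far from the trained one change both speed and, as noted above, quality.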

Training Infrastructure

Training was conducted on 8 nodes with 64 H200 GPUs in total, using a global batch size of 64.
We enable gradient checkpointing, set HSDP_shard_dim = 8 and sequence_parallel_size = 4, and use a learning rate of 1e-5.
We set VSA attention sparsity to 0.9 and train for 3000 steps (~52 hours).
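
The figures above imply a few derived quantities that are useful when adapting the recipe to a different cluster. The sketch below only does arithmetic on the numbers stated in this card; the interpretation of the parallel layout (replica count = GPUs / HSDP_shard_dim, one data stream per sequence-parallel group) is an assumption and may not match how FastVideo partitions work internally. `summarize_training` is a hypothetical helper.

```python
# Values stated in this model card.
NUM_NODES = 8
NUM_GPUS = 64
HSDP_SHARD_DIM = 8
SEQUENCE_PARALLEL_SIZE = 4
GLOBAL_BATCH_SIZE = 64
TRAIN_STEPS = 3000
WALL_CLOCK_HOURS = 52      # approximate, per the card
DATASET_SIZE = 600_000     # synthetic latents


def summarize_training() -> dict:
    """Derive layout and throughput figures from the stated training setup."""
    # Assumption: each sequence-parallel group works on its own samples.
    sp_groups = NUM_GPUS // SEQUENCE_PARALLEL_SIZE
    samples_seen = TRAIN_STEPS * GLOBAL_BATCH_SIZE
    return {
        "gpus_per_node": NUM_GPUS // NUM_NODES,
        # Assumption: parameters are sharded within groups of HSDP_SHARD_DIM
        # GPUs and replicated across the remaining groups.
        "hsdp_replicas": NUM_GPUS // HSDP_SHARD_DIM,
        "sequence_parallel_groups": sp_groups,
        "samples_per_sp_group_per_step": GLOBAL_BATCH_SIZE // sp_groups,
        "samples_seen": samples_seen,
        "dataset_fraction_seen": samples_seen / DATASET_SIZE,
        "seconds_per_step": WALL_CLOCK_HOURS * 3600 / TRAIN_STEPS,
        "gpu_hours": WALL_CLOCK_HOURS * NUM_GPUS,
    }


if __name__ == "__main__":
    for name, value in summarize_training().items():
        print(f"{name}: {value}")
```

Read this way, the run sees 192,000 samples, which is less than one pass over the 600k-latent dataset, at roughly one minute per optimizer step.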

If you use the FastWan2.1-T2V-14B-480P-Diffusers model in your research, please cite our papers:

@article{zhang2025vsa,
  title={VSA: Faster Video Diffusion with Trainable Sparse Attention},
  author={Zhang, Peiyuan and Huang, Haofeng and Chen, Yongqi and Lin, Will and Liu, Zhengzhong and Stoica, Ion and Xing, Eric and Zhang, Hao},
  journal={arXiv preprint arXiv:2505.13389},
  year={2025}
}
@article{zhang2025fast,
  title={Fast video generation with sliding tile attention},
  author={Zhang, Peiyuan and Chen, Yongqi and Su, Runlong and Ding, Hangliang and Stoica, Ion and Liu, Zhengzhong and Zhang, Hao},
  journal={arXiv preprint arXiv:2502.04507},
  year={2025}
}
