STA-V2A: Video-to-Audio Generation with Semantic and Temporal Alignment • Paper • 2409.08601 • Published Sep 13, 2024
Towards Diverse and Efficient Audio Captioning via Diffusion Models • Paper • 2409.09401 • Published Sep 14, 2024
EzAudio: Enhancing Text-to-Audio Generation with Efficient Diffusion Transformer • Paper • 2409.10819 • Published Sep 17, 2024