Update model card for PresentAgent

#11

by nielsr HF Staff - opened Jul 9, 2025

base: refs/heads/main

←

from: refs/pr/11


+159

-71

nielsr

Jul 9, 2025

This PR updates the model card for the ByteDance/MegaTTS3 repository to reflect that the repository is associated with, and hosts components for, PresentAgent: Multimodal Agent for Presentation Video Generation.

The changes include:

- Updating the model description to focus on PresentAgent, as described in its paper (PresentAgent: Multimodal Agent for Presentation Video Generation).
- Setting library_name to transformers (because the repository contains Qwen2 model components) and adding the tag multimodal-agent.
- Keeping the existing pipeline_tag: text-to-speech, as explicitly requested in the task.
- Providing direct links to the paper, the official GitHub repository, and the Colab demo.
- Adding sections on PresentAgent's introduction, setup and usage instructions, benchmark details, experiment results, contribution guidelines, and acknowledgements, all sourced from the official GitHub repository.
- Retaining the repository's license and security information.
- Updating the BibTeX Entry and Citation Info section to cite the PresentAgent paper alongside the existing MegaTTS3 and WavTokenizer citations, reflecting their roles as integral components.

This update aims to provide a more accurate and comprehensive overview of the artifact hosted here, aligning it with the associated research paper.
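For readers unfamiliar with model card metadata, the library_name, pipeline_tag, and tags values mentioned in this description normally sit in the YAML front matter at the top of the repository's README.md. The following is a minimal sketch of what that block might look like, not the actual diff from this PR. The helper `render_front_matter` is hypothetical and written only for illustration; the three values are the ones stated in this description, and any other fields the real card carries (such as the license) are omitted because they are not spelled out here.

```python
# Illustrative sketch only: render_front_matter is a hypothetical helper,
# not part of huggingface_hub or any other library.


def render_front_matter(metadata: dict) -> str:
    """Render a flat metadata dict as a YAML front matter block."""
    lines = ["---"]
    for key, value in metadata.items():
        if isinstance(value, list):
            # Lists become YAML sequences, one item per line.
            lines.append(f"{key}:")
            lines.extend(f"- {item}" for item in value)
        else:
            lines.append(f"{key}: {value}")
    lines.append("---")
    return "\n".join(lines)


# Values taken from the PR description; other fields are intentionally left out.
metadata = {
    "library_name": "transformers",
    "pipeline_tag": "text-to-speech",
    "tags": ["multimodal-agent"],
}

front_matter = render_front_matter(metadata)
print(front_matter)
```

The Hub reads these fields to decide which library snippet and inference widget to show, so keeping pipeline_tag unchanged preserves the repository's existing text-to-speech categorization.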

Update model card for PresentAgent (ef00a35e)


Ready to merge

This branch is ready to get merged automatically.
