AnyMoLe: Any Character Motion In-betweening Leveraging Video Diffusion Models
Abstract
AnyMoLe uses video diffusion models and ICAdapt fine-tuning to generate motion in-between frames for arbitrary characters without extensive datasets, ensuring smooth and realistic transitions.
Despite recent advancements in learning-based motion in-betweening, a key limitation has been overlooked: the requirement for character-specific datasets. In this work, we introduce AnyMoLe, a novel method that addresses this limitation by leveraging video diffusion models to generate motion in-between frames for arbitrary characters without external data. Our approach employs a two-stage frame generation process to enhance contextual understanding. Furthermore, to bridge the domain gap between real-world and rendered character animations, we introduce ICAdapt, a fine-tuning technique for video diffusion models. Additionally, we propose a "motion-video mimicking" optimization technique, enabling seamless motion generation for characters with arbitrary joint structures using 2D and 3D-aware features. AnyMoLe significantly reduces data dependency while generating smooth and realistic transitions, making it applicable to a wide range of motion in-betweening tasks.
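To make the task itself concrete, here is a minimal sketch of what "motion in-betweening" asks for: given two keyframe poses, produce the frames between them. This is not AnyMoLe's method, which relies on video diffusion models; it is only a naive linear-interpolation baseline, and the names `Pose`, `lerp_pose`, `inbetween`, and the example joints are hypothetical, chosen for illustration.

```python
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]
Pose = Dict[str, Vec3]  # joint name -> 3D position; any skeleton layout works


def lerp_pose(start: Pose, end: Pose, t: float) -> Pose:
    """Blend two poses joint by joint; t=0 gives start, t=1 gives end."""
    if start.keys() != end.keys():
        raise ValueError("both keyframes must share the same joints")
    return {
        name: tuple((1.0 - t) * a + t * b for a, b in zip(start[name], end[name]))
        for name in start
    }


def inbetween(start: Pose, end: Pose, num_frames: int) -> List[Pose]:
    """Return num_frames poses strictly between the two keyframes."""
    return [
        lerp_pose(start, end, i / (num_frames + 1))
        for i in range(1, num_frames + 1)
    ]


# Hypothetical two-joint character (assumption: positions only, no rotations).
key_a = {"root": (0.0, 0.0, 0.0), "tail_tip": (0.0, 1.0, 0.0)}
key_b = {"root": (4.0, 0.0, 0.0), "tail_tip": (4.0, 3.0, 0.0)}
frames = inbetween(key_a, key_b, 3)
```

A baseline like this works for any joint layout but ignores context and dynamics, which is the gap that learned in-betweening methods aim to close.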
Community
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- HumanDiT: Pose-Guided Diffusion Transformer for Long-form Human Motion Video Generation (2025)
- MotionMatcher: Motion Customization of Text-to-Video Diffusion Models via Motion Feature Matching (2025)
- TransVDM: Motion-Constrained Video Diffusion Model for Transparent Video Synthesis (2025)
- CatV2TON: Taming Diffusion Transformers for Vision-Based Virtual Try-On with Temporal Concatenation (2025)
- AR-Diffusion: Asynchronous Video Generation with Auto-Regressive Diffusion (2025)
- SayAnything: Audio-Driven Lip Synchronization with Conditional Video Diffusion (2025)
- How to Move Your Dragon: Text-to-Motion Synthesis for Large-Vocabulary Objects (2025)