Papers
arxiv:2510.21461

Enhancing Video Inpainting with Aligned Frame Interval Guidance

Published on Oct 24
Authors:
,
,
,
,

Abstract

Recent image-to-video (I2V) based video inpainting methods have made significant strides by leveraging single-image priors and modeling temporal consistency across masked frames. Nevertheless, these methods suffer from severe content degradation within video chunks. Furthermore, the absence of a robust frame alignment scheme compromises intra-chunk and inter-chunk spatiotemporal stability, resulting in insufficient control over the entire video. To address these limitations, we propose VidPivot, a novel framework that decouples video inpainting into two sub-tasks: multi-frame consistent image inpainting and masked area motion propagation. Our approach introduces frame interval priors as spatiotemporal cues to guide the inpainting process. To enhance cross-frame coherence, we design a FrameProp Module that implements a frame content propagation strategy, diffusing reference frame content into subsequent frames via a splicing mechanism. Additionally, a dedicated context controller encodes these coherent frame priors into the I2V generative backbone, effectively serving as soft constrain to suppress content distortion during generation. Extensive evaluations demonstrate that VidPivot achieves competitive performance across diverse benchmarks and generalizes well to different video inpainting scenarios.

Community

Sign up or log in to comment

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2510.21461 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2510.21461 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2510.21461 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.