Semantically Consistent Video Inpainting with Conditional Diffusion Models. Green, D., Harvey, W., Naderiparizi, S., Niedoba, M., Liu, Y., Liang, X., Lavington, J., Zhang, K., Lioutas, V., Dabiri, S., Scibior, A., Zwartsenberg, B., & Wood, F. 2024.
Current state-of-the-art methods for video inpainting typically rely on optical flow or attention-based approaches to inpaint masked regions by propagating visual information across frames. While such approaches have led to significant progress on standard benchmarks, they struggle with tasks that require the synthesis of novel content that is not present in other frames. In this paper we reframe video inpainting as a conditional generative modeling problem and present a framework for solving such problems with conditional video diffusion models. We highlight the advantages of using a generative approach for this task, showing that our method is capable of generating diverse, high-quality inpaintings and synthesizing new content that is spatially, temporally, and semantically consistent with the provided context.
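For readers unfamiliar with how conditional diffusion models are applied to inpainting, the sketch below illustrates the common "replacement"-style conditioning trick, in which known pixels are re-imposed at the appropriate noise level during each denoising step. This is a generic, simplified illustration and not the architecture or sampler proposed in the paper; the `denoiser` interface, the noise schedule, and the DDIM-like update rule are all assumptions made for the sake of the example.

```python
import torch

def inpaint_with_diffusion(video, mask, denoiser, alphas_cumprod, num_steps=50):
    """Generic replacement-style conditioning for diffusion inpainting (sketch only).

    video:          (T, C, H, W) tensor with ground-truth pixels in the unmasked region.
    mask:           (T, 1, H, W) tensor, 1 where pixels are observed, 0 where they must be inpainted.
    denoiser:       hypothetical trained network predicting the noise eps_theta(x_t, t).
    alphas_cumprod: cumulative products of the noise schedule, one entry per step.
    """
    x = torch.randn_like(video)  # the masked region starts from pure noise
    for t in reversed(range(num_steps)):
        a_bar = alphas_cumprod[t]
        # One simplified, deterministic (DDIM-like) denoising update.
        eps = denoiser(x, t)
        x0_hat = (x - (1 - a_bar).sqrt() * eps) / a_bar.sqrt()
        a_bar_prev = alphas_cumprod[t - 1] if t > 0 else torch.tensor(1.0)
        x = a_bar_prev.sqrt() * x0_hat + (1 - a_bar_prev).sqrt() * eps
        # Re-impose the observed pixels at the matching noise level so that the
        # generated content stays consistent with the provided context.
        noisy_context = a_bar_prev.sqrt() * video + (1 - a_bar_prev).sqrt() * torch.randn_like(video)
        x = mask * noisy_context + (1 - mask) * x
    return x
```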
@unpublished{green2024semantically,
	doi={10.48550/arXiv.2405.00251},
	url_ArXiv={https://arxiv.org/abs/2405.00251},
	url_pdf={https://arxiv.org/pdf/2405.00251},
	author={Green, Dylan and Harvey, William and Naderiparizi, Saeid and Niedoba, Matthew and Liu, Yunpeng and Liang, Xiaoxuan and Lavington, Jonathan and Zhang, Ke and Lioutas, Vasileios and Dabiri, Setareh and Scibior, Adam and Zwartsenberg, Berend and Wood, Frank},
	title={Semantically Consistent Video Inpainting with Conditional Diffusion Models},
	publisher={arXiv},
	year={2024},
	copyright={arXiv.org perpetual, non-exclusive license},
	abstract={Current state-of-the-art methods for video inpainting typically rely on optical flow or attention-based approaches to inpaint masked regions by propagating visual information across frames. While such approaches have led to significant progress on standard benchmarks, they struggle with tasks that require the synthesis of novel content that is not present in other frames. In this paper we reframe video inpainting as a conditional generative modeling problem and present a framework for solving such problems with conditional video diffusion models. We highlight the advantages of using a generative approach for this task, showing that our method is capable of generating diverse, high-quality inpaintings and synthesizing new content that is spatially, temporally, and semantically consistent with the provided context.},
}
