Classifier-Free Guidance inside the Attraction Basin May Cause Memorization

Authors: Anubhav Jain, Yuya Kobayashi, Takashi Shibuya, Yuhta Takida, Nasir Memon, Julian Togelius, Yuki Mitsufuji

Venue: CVPR-25

Date: 2025

Abstract

Diffusion models are prone to exactly reproducing images from their training data. Such reproduction is concerning because it can lead to copyright infringement and/or leakage of privacy-sensitive information. In this paper, we present a novel way to understand the memorization phenomenon and propose a simple yet effective approach to mitigate it. We argue that memorization occurs because of an attraction basin in the denoising process, which steers the diffusion trajectory towards a memorized image. This can be mitigated by guiding the diffusion trajectory away from the attraction basin: classifier-free guidance is withheld until an ideal transition point is reached and is applied only from that point onward. This leads to the generation of non-memorized images that are high in image quality and well aligned with the conditioning mechanism. To improve on this further, we present a new guidance technique, opposite guidance, that escapes the attraction basin sooner in the denoising process. We demonstrate the existence of attraction basins in various scenarios in which memorization occurs, and we show that our proposed approach successfully mitigates memorization.
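The guidance schedule described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes the common classifier-free guidance combination `eps = eps_uncond + w * (eps_cond - eps_uncond)`, a fixed, pre-chosen transition step, and a negative weight as one possible reading of opposite guidance. The names `guidance_weight`, `combine`, `transition_step`, `cfg_scale`, and `opposite_scale` are hypothetical and do not come from the paper.

```python
from typing import List


def guidance_weight(step: int, transition_step: int,
                    cfg_scale: float, opposite_scale: float = 0.0) -> float:
    """Return the guidance weight for a denoising step (0 = first step).

    Assumption: before the transition step, guidance is either withheld
    (weight 0) or reversed (negative weight, standing in for opposite
    guidance). From the transition step onward the usual scale is used.
    """
    if step < transition_step:
        return 0.0 - abs(opposite_scale)
    return cfg_scale


def combine(eps_uncond: List[float], eps_cond: List[float],
            weight: float) -> List[float]:
    """Combine noise predictions with the assumed CFG formula.

    With weight 0 the result equals the unconditional prediction.
    """
    return [u + weight * (c - u) for u, c in zip(eps_uncond, eps_cond)]


if __name__ == "__main__":
    # Toy two-dimensional noise predictions, for illustration only.
    eps_uncond = [1.0, 2.0]
    eps_cond = [3.0, 0.0]
    for step in range(5):
        w = guidance_weight(step, transition_step=2,
                            cfg_scale=7.5, opposite_scale=1.0)
        print(step, w, combine(eps_uncond, eps_cond, w))
```

In this sketch the transition step is a fixed argument; how an ideal transition point is identified in practice is not specified in the abstract and is left out here.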

Related Publications

Diffusion-based Signal Refiner for Speech Enhancement and Separation

IEEE, 2026
Ryosuke Sawata, Masato Hirano*, Naoki Murata, Shusuke Takahashi*, Yuki Mitsufuji

Although recent speech processing technologies have achieved significant improvements in objective metrics, there still remains a gap in human perceptual quality. This paper proposes Diffiner, a novel solution that utilizes the powerful generative capability of diffusion mod…

PAVAS: Physics-Aware Video-to-Audio Synthesis

CVPR, 2026
Oh Hyun-Bin*, Yuhta Takida, Toshimitsu Uesaka, Tae-Hyun Oh*, Yuki Mitsufuji

Recent advances in Video-to-Audio (V2A) generation have achieved impressive perceptual quality and temporal synchronization, yet most models remain appearance-driven, capturing visual-acoustic correlations without considering the physical factors that shape real-world sounds…

MeanFlow Transformers with Representation Autoencoders

CVPR, 2026
Zheyuan Hu*, Chieh-Hsin Lai, Ge Wu*, Yuki Mitsufuji, Stefano Ermon*

MeanFlow (MF) is a diffusion-motivated generative model that enables efficient few-step generation by learning long jumps directly from noise to data. In practice, it is often used as a latent MF by leveraging the pre-trained Stable Diffusion variational autoencoder (SD-VAE)…
