DIAGNOSIS: Detecting Unauthorized Data Usages in Text-to-image Diffusion Models

Zhenting Wang

Chen Chen

Lingjuan Lyu

Dimitris N. Metaxas*

Shiqing Ma*

* External authors

ICLR 2024



Recent text-to-image diffusion models have shown surprising performance in generating high-quality images. However, concerns have arisen regarding unauthorized data usage during the training or fine-tuning process. One example is when a model trainer collects a set of images created by a particular artist and attempts to train a model capable of generating similar images, without obtaining permission from or giving credit to the artist. To address this issue, we propose a method for detecting such unauthorized data usage by planting injected memorization into text-to-image diffusion models trained on the protected dataset. Specifically, we modify the protected images by adding unique content to them using stealthy image warping functions that are nearly imperceptible to humans but can be captured and memorized by diffusion models. By analyzing whether a model has memorized the injected content (i.e., whether its generated images are processed by the injected post-processing function), we can detect models that have illegally utilized the unauthorized data. Experiments on Stable Diffusion and VQ Diffusion with different model training or fine-tuning methods (i.e., LoRA, DreamBooth, and standard training) demonstrate the effectiveness of our proposed method in detecting unauthorized data usage.
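To give a feel for the "stealthy warping" idea, here is a minimal, hypothetical sketch of how a protected image could be coated with a small sinusoidal displacement before release. This is an illustrative stand-in, not the paper's actual warping function or parameter choices: the function name `stealthy_warp`, the sinusoidal form, and the amplitude/frequency values are all assumptions made for the example.

```python
import math

def stealthy_warp(pixels, amplitude=1.0, freq=0.05):
    """Apply a subtle per-row horizontal shift to a grayscale image.

    `pixels` is a list of rows, each row a list of intensity values.
    The displacement is small (roughly +/- `amplitude` pixels), so the
    change is near-imperceptible to humans, yet a model trained on many
    such images can memorize the consistent warping pattern.

    NOTE: illustrative sketch only; the paper's warping functions and
    detection procedure are more involved.
    """
    h = len(pixels)
    w = len(pixels[0])
    out = []
    for y in range(h):
        # per-row displacement following a low-frequency sine wave
        dx = int(round(amplitude * math.sin(2 * math.pi * freq * y)))
        # nearest-neighbor resampling with edge clamping
        row = [pixels[y][min(max(x - dx, 0), w - 1)] for x in range(w)]
        out.append(row)
    return out
```

Detection would then check, roughly, whether images generated by a suspect model carry the statistical signature of this warp; that classifier is beyond the scope of this sketch.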

Related Publications

PerceptAnon: Exploring the Human Perception of Image Anonymization Beyond Pseudonymization for GDPR

ICML, 2024
Kartik Patwari, Chen-Nee Chuah*, Lingjuan Lyu, Vivek Sharma

Current image anonymization techniques largely focus on localized pseudonymization, typically modifying identifiable features like faces or full bodies and evaluating anonymity through metrics such as detection and re-identification rates. However, this approach often overlooks …

COALA: A Practical and Vision-Centric Federated Learning Platform

ICML, 2024
Weiming Zhuang, Jian Xu, Chen Chen, Jingtao Li, Lingjuan Lyu

We present COALA, a vision-centric Federated Learning (FL) platform, and a suite of benchmarks for practical FL scenarios, which we categorize as task, data, and model levels. At the task level, COALA extends support from simple classification to 15 computer vision tasks, in…

How to Trace Latent Generative Model Generated Images without Artificial Watermark?

ICML, 2024
Zhenting Wang, Vikash Sehwag, Chen Chen, Lingjuan Lyu, Dimitris N. Metaxas*, Shiqing Ma*

Latent generative models (e.g., Stable Diffusion) have become more and more popular, but concerns have arisen regarding potential misuse related to images generated by these models. It is, therefore, necessary to analyze the origin of images by inferring if a particular imag…


