* External authors




CATER: Intellectual Property Protection on Text Generation APIs via Conditional Watermarks

Xuanli He*

Qiongkai Xu*

Yi Zeng

Lingjuan Lyu

Fangzhao Wu*

Jiwei Li*

Ruoxi Jia*

* External authors

NeurIPS 2022



Previous works have validated that text generation APIs can be stolen through imitation attacks, causing IP violations. In order to protect the IP of text generation APIs, a recent work has introduced a watermarking algorithm and utilized the null-hypothesis test as a post-hoc ownership verification on the imitation models. However, we find that it is possible to detect those watermarks via sufficient statistics of the frequencies of candidate watermarking words. To address this drawback, in this paper, we propose a novel Conditional wATERmarking framework (CATER) for protecting the IP of text generation APIs. An optimization method is proposed to decide the watermarking rules that can minimize the distortion of overall word distributions while maximizing the change of conditional word selections. Theoretically, we prove that it is infeasible for even the savviest attacker (they know how CATER works) to reveal the used watermarks from a large pool of potential word pairs based on statistical inspection. Empirically, we observe that high-order conditions lead to an exponential growth of suspicious (unused) watermarks, making our crafted watermarks more stealthy. In addition, CATER can effectively identify the IP infringement under architectural mismatch and cross-domain imitation attacks, with negligible impairments on the generation quality of victim APIs. We envision our work as a milestone for stealthily protecting the IP of text generation APIs.

Related Publications

Privacy Assessment on Reconstructed Images: Are Existing Evaluation Metrics Faithful to Human Perception?

NeurIPS, 2023
Xiaoxiao Sun*, Nidham Gazagnadou, Vivek Sharma, Lingjuan Lyu, Hongdong Li*, Liang Zheng*

Hand-crafted image quality metrics, such as PSNR and SSIM, are commonly used to evaluate model privacy risk under reconstruction attacks. Under these metrics, reconstructed images that are determined to resemble the original one generally indicate more privacy leakage. Image…

UltraRE: Enhancing RecEraser for Recommendation Unlearning via Error Decomposition

NeurIPS, 2023
Yuyuan Li*, Chaochao Chen*, Yizhao Zhang*, Weiming Liu*, Lingjuan Lyu, Xiaolin Zheng*, Dan Meng*, Jun Wang*

With growing concerns regarding privacy in machine learning models, regulations have committed to granting individuals the right to be forgotten while mandating companies to develop non-discriminatory machine learning systems, thereby fueling the study of the machine unlearn…

Towards Personalized Federated Learning via Heterogeneous Model Reassembly

NeurIPS, 2023
Jiaqi Wang*, Xingyi Yang*, Suhan Cui*, Liwei Che*, Lingjuan Lyu, Dongkuan Xu*, Fenglong Ma*

This paper focuses on addressing the practical yet challenging problem of model heterogeneity in federated learning, where clients possess models with different network structures. To track this problem, we propose a novel framework called pFedHR, which leverages heterogeneo…

  • HOME
  • Publications
  • CATER: Intellectual Property Protection on Text Generation APIs via Conditional Watermarks


Shape the Future of AI with Sony AI

We want to hear from those of you who have a strong desire
to shape the future of AI.