Authors
- Wenjun Peng*
- Jingwei Yi*
- Fangzhao Wu*
- Shangxi Wu*
- Bin Bin Zhu*
- Lingjuan Lyu
- Binxing Jiao*
- Guangzhong Sun*
- Xing Xie*
* External authors
Venue
- ACL'23
Date
- 2023
Are You Copying My Model? Protecting the Copyright of Large Language Models for EaaS via Backdoor Watermark.
Wenjun Peng*
Jingwei Yi*
Fangzhao Wu*
Shangxi Wu*
Bin Bin Zhu*
Binxing Jiao*
Guangzhong Sun*
Xing Xie*
* External authors
ACL'23
2023
Abstract
Large language models (LLMs) have demonstrated powerful capabilities in both text understanding and generation. Companies have begun to offer Embedding as a Service (EaaS) based on these LLMs, which can benefit various natural language processing (NLP) tasks for customers. However, previous studies have shown that EaaS is vulnerable to model extraction attacks, which can cause significant losses for the owners of LLMs, as training these models is extremely expensive. To protect the copyright of LLMs for EaaS, we propose an Embedding Watermark method called {pasted macro ‘METHOD’} that implants backdoors on embeddings. Our method selects a group of moderate-frequency words from a general text corpus to form a trigger set, then selects a target embedding as the watermark, and inserts it into the embeddings of texts containing trigger words as the backdoor. The weight of insertion is proportional to the number of trigger words included in the text. This allows the watermark backdoor to be effectively transferred to EaaS-stealer’s model for copyright verification while minimizing the adverse impact on the original embeddings’ utility. Our extensive experiments on various datasets show that our method can effectively protect the copyright of EaaS models without compromising service quality.Our code is available at https://github.com/yjw1029/EmbMarker.
Related Publications
The popularity of visual generative AI models like DALL-E 3, Stable Diffusion XL, Stable Video Diffusion, and Sora has been increasing. Through extensive evaluation, we discovered that the state-of-the-art visual generative models can generate content that bears a striking r…
Text-to-image (T2I) diffusion models have shown exceptional capabilities in generating images that closely correspond to textual prompts. However, the advancement of T2I diffusion models presents significant risks, as the models could be exploited for malicious purposes, suc…
With the rapid advancement of generative AI, it is now pos-sible to synthesize high-quality images in a few seconds.Despite the power of these technologies, they raise signif-icant concerns regarding misuse. Current efforts to dis-tinguish between real and AI-generated image…
JOIN US
Shape the Future of AI with Sony AI
We want to hear from those of you who have a strong desire
to shape the future of AI.