
IDEAL: Query-Efficient Data-Free Learning from Black-Box Models

Jie Zhang*

Chen Chen

Lingjuan Lyu

* External authors

ICLR 2023

Abstract

Knowledge Distillation (KD) is a typical method for training a lightweight student model with the help of a well-trained teacher model. However, most KD methods require access to either the teacher's training data or model parameters, which is often unrealistic. To tackle this problem, recent works study KD under data-free and black-box settings. Nevertheless, these works require a large number of queries to the teacher model, which incurs significant monetary and computational costs. To address these problems, we propose a novel method called query-effIcient Data-free lEarning from blAck-box modeLs (IDEAL), which aims to query-efficiently learn from black-box model APIs to train a good student without any real data. Specifically, IDEAL trains the student model in two stages: data generation and model distillation. Note that IDEAL does not require any queries in the data generation stage and queries the teacher only once for each sample in the distillation stage. Extensive experiments on various real-world datasets show the effectiveness of the proposed IDEAL. For instance, IDEAL can improve the performance of the best baseline method DFME by 5.83% on the CIFAR10 dataset with only 0.02× the query budget of DFME. Our code will be published upon acceptance.
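The two-stage pipeline described in the abstract can be illustrated with a short sketch. The PyTorch code below is a minimal illustration under assumed details: the toy Generator and student architectures, the entropy-based generator objective, and the query_teacher stand-in for the black-box API are all hypothetical placeholders, not the authors' released implementation. What it does show faithfully is the query-budget property from the abstract: stage 1 never touches the teacher, and stage 2 sends each synthetic sample to the teacher exactly once.

```python
# Minimal two-stage sketch of a query-efficient data-free pipeline
# (hypothetical illustration; not the paper's released code).
import torch
import torch.nn as nn
import torch.nn.functional as F

LATENT_DIM, NUM_CLASSES = 100, 10

class Generator(nn.Module):
    """Toy generator mapping noise to 32x32 RGB images (assumed shape)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(LATENT_DIM, 3 * 32 * 32), nn.Tanh())

    def forward(self, z):
        return self.net(z).view(-1, 3, 32, 32)

generator = Generator()
student = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, NUM_CLASSES))
g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
s_opt = torch.optim.SGD(student.parameters(), lr=0.1, momentum=0.9)

def query_teacher(images: torch.Tensor) -> torch.Tensor:
    # Stand-in for the black-box teacher API. A real deployment would
    # send `images` to the service and receive labels back; here we
    # return random hard labels so the sketch runs end to end.
    return torch.randint(0, NUM_CLASSES, (images.size(0),))

# Stage 1: data generation -- zero teacher queries. The generator is
# optimized against the student only; the entropy objective below is
# one plausible query-free loss, not necessarily the paper's exact one.
for _ in range(100):
    z = torch.randn(64, LATENT_DIM)
    probs = F.softmax(student(generator(z)), dim=1)
    entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1).mean()
    g_opt.zero_grad()
    entropy.backward()
    g_opt.step()

# Stage 2: distillation -- exactly one teacher query per sample.
with torch.no_grad():
    synthetic = generator(torch.randn(512, LATENT_DIM))
labels = query_teacher(synthetic)  # 512 samples -> 512 queries in total

for _ in range(10):
    loss = F.cross_entropy(student(synthetic), labels)
    s_opt.zero_grad()
    loss.backward()
    s_opt.step()
```

Because the teacher is only ever queried on the fixed synthetic set, the total query count equals the number of synthetic samples, which is the property that makes this style of pipeline query-efficient relative to methods that query the teacher throughout training.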

Related Publications

FedMef: Towards Memory-efficient Federated Dynamic Pruning

CVPR, 2024
Hong Huang, Weiming Zhuang, Chen Chen, Lingjuan Lyu

Federated learning (FL) promotes decentralized training while prioritizing data confidentiality. However, its application on resource-constrained devices is challenging due to the high demand for computation and memory resources for training deep learning models. Neural netw…

DIAGNOSIS: Detecting Unauthorized Data Usages in Text-to-image Diffusion Models

ICLR, 2024
Zhenting Wang, Chen Chen, Lingjuan Lyu, Dimitris N. Metaxas*, Shiqing Ma*

Recent text-to-image diffusion models have shown surprising performance in generating high-quality images. However, concerns have arisen regarding the unauthorized data usage during the training or fine-tuning process. One example is when a model trainer collects a set of im…

Combating Data Imbalances in Federated Semi-supervised Learning with Dual Regulators

AAAI, 2024
Sikai Bai*, Shuaicheng Li*, Weiming Zhuang, Jie Zhang*, Kunlin Yang*, Jun Hou*, Shuai Yi*, Shuai Zhang*, Junyu Gao*

Federated learning has become a popular method to learn from decentralized heterogeneous data. Federated semi-supervised learning (FSSL) emerges to train models from a small fraction of labeled data due to label scarcity on decentralized clients. Existing FSSL methods assume…

