SAN: Inducing Metrizability of GAN with Discriminative Normalized Linear Layer

Yuhta Takida

Masaaki Imaizumi*

Takashi Shibuya

Chieh-Hsin Lai

Toshimitsu Uesaka

Naoki Murata

Yuki Mitsufuji

* External authors

ICLR 2024

2024

Abstract

Generative adversarial networks (GANs) learn a target probability distribution by optimizing a generator and a discriminator with minimax objectives. This paper asks whether such optimization actually provides the generator with gradients that move its distribution closer to the target distribution. By connecting the GAN formulation with the concept of sliced optimal transport, we derive metrizable conditions: sufficient conditions for the discriminator to serve as a distance between the distributions. Leveraging these theoretical results, we propose a novel GAN training scheme called the slicing adversarial network (SAN). With only simple modifications, a broad class of existing GANs can be converted to SANs. Experiments on synthetic and image datasets support our theoretical results and demonstrate SAN's effectiveness compared to standard GANs. We also apply SAN to StyleGAN-XL, which yields a state-of-the-art FID score among GANs for class-conditional generation on ImageNet 256×256.
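For readers unfamiliar with the sliced optimal transport idea mentioned in the abstract, the following is a minimal sketch of a Monte Carlo sliced Wasserstein-1 distance between two equal-size point sets. It is not the SAN training scheme and is not taken from the paper; it only illustrates the generic idea of comparing distributions along normalized one-dimensional directions. All function names and parameters (`sliced_wasserstein`, `num_slices`, `seed`) are hypothetical and chosen for illustration.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Draw a direction on the unit sphere by normalizing a Gaussian vector."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def project(points, direction):
    """Project each point onto the direction, giving a 1-D 'slice' of the sample."""
    return [sum(p_i * d_i for p_i, d_i in zip(p, direction)) for p in points]


def wasserstein_1d(a, b):
    """Wasserstein-1 distance between two equal-size 1-D samples.

    In one dimension the optimal coupling matches sorted values.
    """
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


def sliced_wasserstein(xs, ys, num_slices=64, seed=0):
    """Average the 1-D distance over random unit directions (Monte Carlo estimate)."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    dim = len(xs[0])
    total = 0.0
    for _ in range(num_slices):
        direction = random_unit_vector(dim, rng)
        total += wasserstein_1d(project(xs, direction), project(ys, direction))
    return total / num_slices
```

As a usage example, calling `sliced_wasserstein(xs, xs)` on any point set returns zero, while shifting every point of `xs` by a fixed offset gives a positive value no larger than the length of the offset. Averaging over fixed random directions is an assumption of this sketch; how the paper selects or learns its directions is not described in the abstract.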

Related Publications

PaGoDA: Progressive Growing of a One-Step Generator from a Low-Resolution Diffusion Teacher

NeurIPS, 2024
Dongjun Kim*, Chieh-Hsin Lai, Wei-Hsiang Liao, Yuhta Takida, Naoki Murata, Toshimitsu Uesaka, Yuki Mitsufuji, Stefano Ermon*

To accelerate sampling, diffusion models (DMs) are often distilled into generators that directly map noise to data in a single step. In this approach, the resolution of the generator is fundamentally limited by that of the teacher DM. To overcome this limitation, we propose …

GenWarp: Single Image to Novel Views with Semantic-Preserving Generative Warping

NeurIPS, 2024
Junyoung Seo, Kazumi Fukuda, Takashi Shibuya, Takuya Narihira, Naoki Murata, Shoukang Hu, Chieh-Hsin Lai, Seungryong Kim*, Yuki Mitsufuji

Generating novel views from a single image remains a challenging task due to the complexity of 3D scenes and the limited diversity in the existing multi-view datasets to train a model on. Recent research combining large-scale text-to-image (T2I) models with monocular depth e…

The whole is greater than the sum of its parts: improving music source separation by bridging networks

EURASIP, 2024
Ryosuke Sawata, Naoya Takahashi, Stefan Uhlich*, Shusuke Takahashi*, Yuki Mitsufuji

This paper presents the crossing scheme (X-scheme) for improving the performance of deep neural network (DNN)-based music source separation (MSS) with almost no increasing calculation cost. It consists of three components: (i) multi-domain loss (MDL), (ii) bridging operation…
