Kengo Uchida

Publications

MoLA: Motion Generation and Editing with Latent Diffusion Enhanced by Adversarial Training

CVPR, 2025
Kengo Uchida, Takashi Shibuya, Yuhta Takida, Naoki Murata, Julian Tanke, Shusuke Takahashi*, Yuki Mitsufuji

In text-to-motion generation, controllability as well as generation quality and speed has become increasingly critical. The controllability challenges include generating a motion of a length that matches the given textual description and editing the generated motions accordi…

HQ-VAE: Hierarchical Discrete Representation Learning with Variational Bayes

TMLR, 2024
Yuhta Takida, Yukara Ikemiya, Takashi Shibuya, Kazuki Shimada, Woosung Choi, Chieh-Hsin Lai, Naoki Murata, Toshimitsu Uesaka, Kengo Uchida, Yuki Mitsufuji, Wei-Hsiang Liao

Vector quantization (VQ) is a technique to deterministically learn features with discrete codebook representations. It is commonly performed with a variational autoencoding model, VQ-VAE, which can be further extended to hierarchical structures for making high-fidelity recon…

Zero- and Few-shot Sound Event Localization and Detection

ICASSP, 2024
Kazuki Shimada, Kengo Uchida, Yuichiro Koyama*, Takashi Shibuya, Shusuke Takahashi*, Yuki Mitsufuji, Tatsuya Kawahara*

Sound event localization and detection (SELD) systems estimate direction-of-arrival (DOA) and temporal activation for sets of target classes. Neural network (NN)-based SELD systems have performed well in various sets of target classes, but they only output the DOA and tempor…
