Discogs-VINet-MIREX

Authors: Xavier Serra, Yuki Mitsufuji, R.O. Araz, J. Serrà, D. Bogdanov

Venue: MIREX 2024

Date: 2025

Abstract

This technical report presents our submission to the cover song identification task of the 2024 edition of the Music Information Retrieval Evaluation eXchange (MIREX). For this submission, we enhanced our Discogs-VINet model in three ways: we changed the definition of an epoch, we used automatic mixed precision (AMP) during both training and inference, and we sampled four versions per clique during triplet mining, which became possible with AMP. Given the enhanced model’s performance on the Discogs-VI test set, we then trained a new model from scratch on the entire Discogs-VI dataset rather than only the training partition used for Discogs-VINet, a 45% increase in the number of versions. We call this enhanced and retrained model Discogs-VINet-MIREX.
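The abstract does not describe how the four versions per clique are drawn, so the following is only a rough illustration of the general idea: build a batch by picking some cliques and taking four distinct versions from each, so that every item has three in-batch positives available for triplet mining. The function name `sample_batch`, the `cliques_per_batch` parameter, the toy data, and the choice to skip cliques with fewer than four versions are all assumptions made for this sketch and are not taken from the report.

```python
import random
from typing import Dict, List, Tuple

VERSIONS_PER_CLIQUE = 4  # "four versions per clique", as stated in the abstract


def sample_batch(
    cliques: Dict[str, List[str]],
    cliques_per_batch: int,
    rng: random.Random,
) -> List[Tuple[str, str]]:
    """Return (clique_id, version_id) pairs, four versions per sampled clique.

    Assumption: cliques with fewer than four versions are skipped. The report
    does not say how such cliques are handled.
    """
    eligible = [
        cid for cid, versions in cliques.items()
        if len(versions) >= VERSIONS_PER_CLIQUE
    ]
    chosen = rng.sample(eligible, cliques_per_batch)  # distinct cliques
    batch = []
    for cid in chosen:
        # Sample without replacement so the four versions are distinct.
        for vid in rng.sample(cliques[cid], VERSIONS_PER_CLIQUE):
            batch.append((cid, vid))
    return batch


# Hypothetical toy data: clique id -> version ids.
toy_cliques = {
    "c1": ["c1_v1", "c1_v2", "c1_v3", "c1_v4", "c1_v5"],
    "c2": ["c2_v1", "c2_v2", "c2_v3", "c2_v4"],
    "c3": ["c3_v1", "c3_v2"],  # too small, skipped in this sketch
    "c4": ["c4_v1", "c4_v2", "c4_v3", "c4_v4", "c4_v5", "c4_v6"],
}

example_batch = sample_batch(toy_cliques, cliques_per_batch=2, rng=random.Random(0))
for clique_id, version_id in example_batch:
    print(clique_id, version_id)
```

With two cliques per batch, the batch holds eight items. Within the batch, items sharing a clique id can serve as positives and items from the other clique as negatives.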
