Faster Machine Translation Ensembling with Reinforcement Learning and Competitive Correction

Kritarth Prasad

Mohammadi Zaki

Pratik Singh

Pankaj Wasnik

2025

Abstract

Ensembling neural machine translation (NMT) models to produce translations of higher quality than any of the $L$ individual models has been studied extensively. Recent methods typically combine a candidate selection block (CSB) with an encoder-decoder fusion block (FB). They require inference across all $L$ candidate models, so the computational overhead is significant, generally $\Omega(L)$. This paper introduces SmartGen, a reinforcement learning (RL)-based strategy that improves the CSB by selecting a small, fixed number of candidates and identifying optimal groups to pass to the fusion block for each input sentence. In earlier work, the CSB and FB were also trained independently, which led to suboptimal NMT performance. Our DQN-based SmartGen addresses this by using feedback from the FB as a reward during training. We also resolve a key issue in earlier methods, where candidates were passed to the FB without modification, by introducing a Competitive Correction Block (CCB). Finally, we validate our approach with extensive experiments on English-Hindi translation tasks in both directions.
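The selection idea in the abstract can be illustrated with a toy sketch. This is not the authors' SmartGen implementation: a tabular, single-step value update is swapped in for the DQN, the sentence "state" is reduced to a hypothetical length bucket, and `QUALITY`, `fusion_reward`, `GroupSelector`, and `train_selector` are all names invented here for illustration. The point it shows is that only the models in the chosen group of fixed size are queried per sentence, and that the selector learns from a reward standing in for fusion-block feedback.

```python
import itertools
import random

# Hypothetical per-model quality for each sentence bucket. In the paper the
# reward comes from the fusion block; this table is only a stand-in.
QUALITY = {
    "short": [0.9, 0.8, 0.2, 0.1],
    "long": [0.1, 0.2, 0.8, 0.9],
}


def fusion_reward(state, group):
    """Assumed reward: mean quality of the models in the selected group."""
    return sum(QUALITY[state][m] for m in group) / len(group)


class GroupSelector:
    """Epsilon-greedy selector over all groups of a fixed size.

    A lookup table of action values replaces the deep Q-network here.
    """

    def __init__(self, num_models, group_size, epsilon=0.2, lr=0.5, seed=0):
        # Each action is one group of `group_size` model indices.
        self.groups = list(itertools.combinations(range(num_models), group_size))
        self.q = {}  # (state, action index) -> estimated reward
        self.epsilon = epsilon
        self.lr = lr
        self.rng = random.Random(seed)

    def select(self, state, explore=True):
        """Return the index of the group to run for this state."""
        if explore and self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.groups))
        values = [self.q.get((state, a), 0.0) for a in range(len(self.groups))]
        return max(range(len(values)), key=values.__getitem__)

    def update(self, state, action, reward):
        """Move the stored value toward the observed reward."""
        old = self.q.get((state, action), 0.0)
        self.q[(state, action)] = old + self.lr * (reward - old)


def train_selector(steps=2000):
    """Train on alternating sentence buckets; each sentence is one step."""
    selector = GroupSelector(num_models=4, group_size=2)
    for step in range(steps):
        state = "short" if step % 2 == 0 else "long"
        action = selector.select(state)
        # Only the models in the chosen group would be run for this sentence.
        reward = fusion_reward(state, selector.groups[action])
        selector.update(state, action, reward)
    return selector


if __name__ == "__main__":
    trained = train_selector()
    for bucket in ("short", "long"):
        best = trained.groups[trained.select(bucket, explore=False)]
        print(bucket, best)
```

With a group size of 2 out of 4 models, each sentence costs two model inferences instead of four. A real system would replace the length bucket with learned sentence features and the reward table with a translation-quality signal from the fusion block.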

