An Imitation from Observation Approach to Transfer Learning with Dynamics Mismatch

Siddharth Desai*

Ishan Durugkar

Haresh Karnan*

Garrett Warnell*

Josiah Hanna*

Peter Stone

* External authors

NeurIPS-2020

2020

Abstract

We examine the problem of transferring a policy learned in a source environment to a target environment with different dynamics, particularly in the case where it is critical to reduce the amount of interaction with the target environment during learning. This problem is especially important in sim-to-real transfer because simulators inevitably model real-world dynamics imperfectly. In this paper, we show that one existing solution to this transfer problem, grounded action transformation, is closely related to the problem of imitation from observation (IfO): learning behaviors that mimic the observations of behavior demonstrations. After establishing this relationship, we hypothesize that recent state-of-the-art approaches from the IfO literature can be effectively repurposed for grounded transfer learning. To validate our hypothesis, we derive a new algorithm, generative adversarial reinforced action transformation (GARAT), based on adversarial imitation from observation techniques. We run experiments in several domains with mismatched dynamics and find that agents trained with GARAT achieve higher returns in the target environment than agents trained with existing black-box transfer methods.
