An Imitation from Observation Approach to Transfer Learning with Dynamics Mismatch

Siddharth Desai*

Ishan Durugkar

Haresh Karnan*

Garrett Warnell*

Josiah Hanna*

Peter Stone

* External authors

NeurIPS, 2020

Abstract

We examine the problem of transferring a policy learned in a source environment to a target environment with different dynamics, particularly in the case where it is critical to reduce the amount of interaction with the target environment during learning. This problem is particularly important in sim-to-real transfer because simulators inevitably model real-world dynamics imperfectly. In this paper, we show that one existing solution to this transfer problem -- grounded action transformation -- is closely related to the problem of imitation from observation (IfO): learning behaviors that mimic the observations of behavior demonstrations. After establishing this relationship, we hypothesize that recent state-of-the-art approaches from the IfO literature can be effectively repurposed for grounded transfer learning. To validate our hypothesis, we derive a new algorithm -- generative adversarial reinforced action transformation (GARAT) -- based on adversarial imitation from observation techniques. We run experiments in several domains with mismatched dynamics, and find that agents trained with GARAT achieve higher returns in the target environment compared to existing black-box transfer methods.
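The core idea behind grounded action transformation can be illustrated with a toy sketch: an action transformer modifies the agent's action before it is passed to the simulator, so that the simulator's next state matches what the target environment would produce. The example below is not the paper's method -- the dynamics are an invented 1-D point mass, and a least-squares fit stands in for GARAT's adversarial imitation-from-observation training -- but it shows the structural role the transformer plays.

```python
import numpy as np

# Hypothetical toy dynamics (not from the paper): a 1-D system whose
# "simulator" gain differs from the "real" environment's gain.
def sim_step(s, a, gain=1.0):
    return s + gain * a

def real_step(s, a):
    return s + 0.5 * a  # target dynamics with a mismatched gain

def fit_transformer(transitions):
    """Fit a scalar action transformer k so that sim_step(s, k*a)
    matches observed real transitions. Since s + k*a = s' implies
    k = sum(a * (s' - s)) / sum(a * a), this is closed-form least
    squares; GARAT instead trains the transformer adversarially."""
    a = np.array([t[1] for t in transitions])
    delta = np.array([t[2] - t[0] for t in transitions])
    return float(a @ delta / (a @ a))

# Collect (s, a, s') transitions from the "real" environment.
rng = np.random.default_rng(0)
transitions, s = [], 0.0
for _ in range(50):
    act = rng.uniform(-1, 1)
    s_next = real_step(s, act)
    transitions.append((s, act, s_next))
    s = s_next

k = fit_transformer(transitions)
# Grounded simulator: transform the action, then step the simulator.
grounded_next = sim_step(1.0, k * 0.8)
print(round(k, 3), round(grounded_next, 3))  # -> 0.5 1.4
```

With the transformer in place, the grounded simulator reproduces the target environment's transitions exactly in this linear toy case; the paper's contribution is learning such a transformer for black-box dynamics from transition observations alone.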

Related Publications

N-agent Ad Hoc Teamwork

NeurIPS, 2024
Caroline Wang*, Arrasy Rahman*, Ishan Durugkar, Elad Liebman*, Peter Stone

Current approaches to learning cooperative multi-agent behaviors assume relatively restrictive settings. In standard fully cooperative multi-agent reinforcement learning, the learning algorithm controls all agents in the scenario, while in ad hoc teamwork, the learning algor…

Discovering Creative Behaviors through DUPLEX: Diverse Universal Features for Policy Exploration

NeurIPS, 2024
Borja G. Leon*, Francesco Riccio, Kaushik Subramanian, Pete Wurman, Peter Stone

The ability to approach the same problem from different angles is a cornerstone of human intelligence that leads to robust solutions and effective adaptation to problem variations. In contrast, current RL methodologies tend to lead to policies that settle on a single solutio…

A Super-human Vision-based Reinforcement Learning Agent for Autonomous Racing in Gran Turismo

RLC, 2024
Miguel Vasco*, Takuma Seno, Kenta Kawamoto, Kaushik Subramanian, Pete Wurman, Peter Stone

Racing autonomous cars faster than the best human drivers has been a longstanding grand challenge for the fields of Artificial Intelligence and robotics. Recently, an end-to-end deep reinforcement learning agent met this challenge in a high-fidelity racing simulator, Gran Tu…
