* External authors




An Imitation from Observation Approach to Transfer Learning with Dynamics Mismatch

Siddharth Desai*

Ishan Durugkar*

Haresh Karnan*

Garrett Warnell*

Josiah Hanna*

Peter Stone





We examine the problem of transferring a policy learned in a source environment to a target environment with different dynamics, focusing on the case where it is critical to reduce the amount of interaction with the target environment during learning. This problem is especially important in sim-to-real transfer because simulators inevitably model real-world dynamics imperfectly. In this paper, we show that one existing solution to this transfer problem, grounded action transformation, is closely related to the problem of imitation from observation (IfO): learning behaviors that mimic the observations of behavior demonstrations. After establishing this relationship, we hypothesize that recent state-of-the-art approaches from the IfO literature can be effectively repurposed for grounded transfer learning. To validate our hypothesis, we derive a new algorithm, generative adversarial reinforced action transformation (GARAT), based on adversarial imitation from observation techniques. We run experiments in several domains with mismatched dynamics and find that agents trained with GARAT achieve higher returns in the target environment than agents trained with existing black-box transfer methods.
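
The idea behind grounded action transformation can be illustrated with a toy sketch: actions are transformed before they reach the source environment so that the resulting transitions resemble those observed in the target environment. The one-dimensional dynamics, the function names, and the linear action scale `k` below are all assumptions made for illustration. The scale is fitted by ordinary least squares, which is a deliberately simple stand-in and not the adversarial training that GARAT uses.

```python
import random


def source_step(state, action):
    """Hypothetical source (simulator) dynamics: the action moves the state directly."""
    return state + action


def target_step(state, action):
    """Hypothetical target dynamics: the same action has only half the effect."""
    return state + 0.5 * action


def fit_action_scale(transitions):
    """Least-squares fit of k so that source_step(s, k * a) matches the observed s_next.

    For the toy source dynamics this reduces to fitting k * a to (s_next - s).
    This is an illustrative stand-in, not GARAT's adversarial objective.
    """
    numerator = sum(a * (s_next - s) for s, a, s_next in transitions)
    denominator = sum(a * a for _, a, _ in transitions)
    return numerator / denominator


def make_grounded_step(k):
    """Wrap the source dynamics so every action is transformed before it is applied."""
    def grounded_step(state, action):
        return source_step(state, k * action)
    return grounded_step


# Collect a small batch of transitions from the (toy) target environment.
rng = random.Random(0)
transitions = []
for _ in range(20):
    s = rng.uniform(-1.0, 1.0)
    a = rng.uniform(-1.0, 1.0)
    transitions.append((s, a, target_step(s, a)))

k = fit_action_scale(transitions)
grounded_step = make_grounded_step(k)

# Compare next-state mismatch with and without the action transformation.
gap_before = abs(source_step(0.0, 1.0) - target_step(0.0, 1.0))
gap_after = abs(grounded_step(0.0, 1.0) - target_step(0.0, 1.0))
print(f"fitted scale: {k:.3f}, gap before: {gap_before:.3f}, gap after: {gap_after:.3f}")
```

A policy would then be trained in the grounded source environment (`grounded_step`) rather than in the original one, so that only the small batch of target transitions requires interaction with the target environment.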



