* External authors




Building Minimal and Reusable Causal State Abstractions for Reinforcement Learning

Zizhao Wang*

Caroline Wang*

Xuesu Xiao*

Yuke Zhu*

Peter Stone

* External authors

AAAI 2024



Two desiderata of reinforcement learning (RL) algorithms are the ability to learn from relatively little experience and the ability to learn policies that generalize to a range of problem specifications. In factored state spaces, one approach towards achieving both goals is to learn state abstractions, which only keep the necessary variables for learning the tasks at hand. This paper introduces Causal Bisimulation Modeling (CBM), a method that learns the causal relationships in the dynamics and reward functions for each task to derive a minimal, task-specific abstraction. CBM leverages and improves implicit modeling to train a high-fidelity causal dynamics model that can be reused for all tasks in the same environment. Empirical validation on manipulation environments and Deepmind Control Suite reveals that CBM's learned implicit dynamics models identify the underlying causal relationships and state abstractions more accurately than explicit ones. Furthermore, the derived state abstractions allow a task learner to achieve near-oracle levels of sample efficiency and outperform baselines on all tasks.

Related Publications

A Super-human Vision-based Reinforcement Learning Agent for Autonomous Racing in Gran Turismo

RLC, 2024
Miguel Vasco*, Takuma Seno, Kenta Kawamoto, Kaushik Subramanian, Peter Stone, Peter Wurman

Racing autonomous cars faster than the best human drivers has been a longstanding grand challenge for the fields of Artificial Intelligence and robotics. Recently, an end-to-end deep reinforcement learning agent met this challenge in a high-fidelity racing simulator, Gran Tu…

Wait That Feels Familiar: Learning to Extrapolate Human Preferences for Preference-Aligned Path Planning.

ICRA, 2024
Haresh Karnan*, Elvin Yang*, Garrett Warnell*, Joydeep Biswas*, Peter Stone

Autonomous mobility tasks such as lastmile delivery require reasoning about operator indicated preferences over terrains on which the robot should navigate to ensure both robot safety and mission success. However, coping with out of distribution data from novel terrains or a…

Now, Later, and Lasting: 10 Priorities for AI Research, Policy, and Practice.

COACM, 2024
Eric Horvitz*, Vincent Conitzer*, Sheila McIlraith*, Peter Stone

Advances in artificial intelligence (AI) will transform many aspects of our lives and society, bringing immense opportunities but also posing significant risks and challenges. The next several decades may well be a turning point for humanity, comparable to the industrial rev…

  • HOME
  • Publications
  • Building Minimal and Reusable Causal State Abstractions for Reinforcement Learning


Shape the Future of AI with Sony AI

We want to hear from those of you who have a strong desire
to shape the future of AI.