Authors

* External authors

Venue

Date

Share

Skill-Critic: Refining Learned Skills for Hierarchical Reinforcement Learning

Ce Hao*

Catherine Weaver*

Chen Tang*

Kenta Kawamoto

Masayoshi Tomizuka*

Wei Zhan*

* External authors

RAL 2024

2024

Abstract

Hierarchical reinforcement learning (RL) can accelerate long-horizon decision-making by temporally abstracting a policy into multiple levels. Promising results in sparse reward environments have been seen with skills , i.e. sequences of primitive actions. Typically, a skill latent space and policy are discovered from offline data. However, the resulting low-level policy can be unreliable due to low-coverage demonstrations or distribution shifts. As a solution, we propose the Skill-Critic algorithm to fine-tune the low-level policy in conjunction with high-level skill selection. Our Skill-Critic algorithm optimizes both the low-level and high-level policies; these policies are initialized and regularized by the latent space learned from offline demonstrations to guide the parallel policy optimization. We validate Skill-Critic in multiple sparse-reward RL environments, including a new sparse-reward autonomous racing task in Gran Turismo Sport. The experiments show that Skill-Critic's low-level policy fine-tuning and demonstration-guided regularization are essential for good performance.

Related Publications

A Super-human Vision-based Reinforcement Learning Agent for Autonomous Racing in Gran Turismo

RLC, 2024
Miguel Vasco*, Takuma Seno, Kenta Kawamoto, Kaushik Subramanian, Pete Wurman, Peter Stone

Racing autonomous cars faster than the best human drivers has been a longstanding grand challenge for the fields of Artificial Intelligence and robotics. Recently, an end-to-end deep reinforcement learning agent met this challenge in a high-fidelity racing simulator, Gran Tu…

BeTAIL: Behavior Transformer Adversarial Imitation Learning from Human Racing Gameplay

RAL, 2024
Catherine Weaver*, Chen Tang*, Ce Hao*, Kenta Kawamoto, Masayoshi Tomizuka*, Wei Zhan*

Autonomous racing poses a significant challenge for control, requiring planning minimum-time trajectories under uncertain dynamics and controlling vehicles at their handling limits. Current methods requiring hand-designed physical models or reward functions specific to each …

Real-time Trajectory Generation via Dynamic Movement Primitives for Autonomous Racing

ACC, 2024
Catherine Weaver*, Roberto Capobianco, Peter R. Wurman, Peter Stone, Masayoshi Tomizuka*

We employ sequences of high-order motion primitives for efficient online trajectory planning, enabling competitive racecar control even when the car deviates from an offline demonstration. Dynamic Movement Primitives (DMPs) utilize a target-driven non-linear differential equ…

  • HOME
  • Publications
  • Skill-Critic: Refining Learned Skills for Hierarchical Reinforcement Learning

JOIN US

Shape the Future of AI with Sony AI

We want to hear from those of you who have a strong desire
to shape the future of AI.