Venue
- NeurIPS 2023
Date
- 2023
f-Policy Gradients: A General Framework for Goal-Conditioned RL using f-Divergences
Siddhant Agarwal*
Ishan Durugkar
Amy Zhang*
* External authors
NeurIPS 2023
2023
Abstract
Goal-Conditioned RL problems provide sparse rewards where the agent receives a reward signal only when it has achieved the goal, making exploration a difficult problem. Several works augment this sparse reward with a learned dense reward function, but this can lead to suboptimality in exploration and misalignment of the task. Moreover, recent works have demonstrated that effective shaping rewards for a particular problem can depend on the underlying learning algorithm. Our work ($f$-PG or $f$-Policy Gradients) shows that minimizing f-divergence between the agent's state visitation distribution and the goal can give us an optimal policy. We derive gradients for various f-divergences to optimize this objective. This objective provides dense learning signals for exploration in sparse reward settings. We further show that entropy maximizing policy optimization for commonly used metric-based shaping rewards like L2 and temporal distance can be reduced to special cases of f-divergences, providing a common ground to study such metric-based shaping rewards. We compare $f$-Policy Gradients with standard policy gradients methods on a challenging gridworld as well as the Point Maze environments.
Related Publications
Current approaches to learning cooperative multi-agent behaviors assume relatively restrictive settings. In standard fully cooperative multi-agent reinforcement learning, the learning algorithm controls all agents in the scenario, while in ad hoc teamwork, the learning algor…
The ability to approach the same problem from different angles is a cornerstone of human intelligence that leads to robust solutions and effective adaptation to problem variations. In contrast, current RL methodologies tend to lead to policies that settle on a single solutio…
Racing autonomous cars faster than the best human drivers has been a longstanding grand challenge for the fields of Artificial Intelligence and robotics. Recently, an end-to-end deep reinforcement learning agent met this challenge in a high-fidelity racing simulator, Gran Tu…
JOIN US
Shape the Future of AI with Sony AI
We want to hear from those of you who have a strong desire
to shape the future of AI.